Human Compatible

Author Stuart J. Russell
Country United States
Language English
Subject AI control problem
GenreNon-fiction
Publisher Viking
Publication date October 8, 2019
Pages 352
ISBN 978-0-525-55861-3
OCLC 1083694322

Human Compatible: Artificial Intelligence and the Problem of Control is a 2019 non-fiction book by computer scientist Stuart J. Russell. It asserts that the risk to humanity from advanced artificial intelligence (AI) is a serious concern despite the uncertainty surrounding future progress in AI. It also proposes an approach to the AI control problem.

Summary

Russell begins by asserting that the standard model of AI research, in which the primary definition of success is getting better and better at achieving rigid human-specified goals, is dangerously misguided. Such goals may not reflect what human designers intend, for example by omitting human values that were never written into the goals. If an AI developed according to the standard model became superintelligent, it would likely not fully reflect human values and could be catastrophic to humanity. Russell asserts that precisely because the timeline for developing human-level or superintelligent AI is highly uncertain, safety research should begin as soon as possible, since it is equally uncertain how long such research would take to complete.

Russell argues that continuing progress in AI capability is inevitable because of economic pressures. Such pressures can already be seen in the development of existing AI technologies such as self-driving cars and personal assistant software. Moreover, human-level AI could be worth many trillions of dollars. Russell then examines the current debate surrounding AI risk. He offers refutations to a number of common arguments dismissing AI risk and attributes much of their persistence to tribalism—AI researchers may see AI risk concerns as an "attack" on their field. Russell reiterates that there are legitimate reasons to take AI risk concerns seriously and that economic pressures make continued innovation in AI inevitable.

Russell then proposes an approach to developing provably beneficial machines, one centered on deference to humans. Unlike in the standard model of AI, where the objective is fixed and known with certainty, this approach keeps the AI's true objective uncertain: the AI approaches certainty about it only as it gains more information about humans and the world. This uncertainty would, ideally, prevent catastrophic misunderstandings of human preferences and encourage cooperation and communication with humans. Russell concludes by calling for tighter governance of AI research and development, as well as cultural introspection about how much autonomy humans should retain in an AI-dominated world.

Russell's three principles

Russell lists three principles to guide the development of beneficial machines. He emphasizes that these principles are not meant to be explicitly coded into the machines; rather, they are intended for human developers. The principles are as follows: [1] :173

1. The machine's only objective is to maximize the realization of human preferences.

2. The machine is initially uncertain about what those preferences are.

3. The ultimate source of information about human preferences is human behavior.

The "preferences" Russell refers to "are all-encompassing; they cover everything you might care about, arbitrarily far into the future." [1] :173 Similarly, "behavior" includes any choice between options, [1] :177 and the uncertainty is such that some probability, which may be quite small, must be assigned to every logically possible human preference. [1] :201
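Principles 2 and 3 can be illustrated with a toy Bayesian model (this sketch is not from the book; the options, hypotheses, and the 90% "noisy rationality" figure are hypothetical). The machine starts maximally uncertain about the human's preference and updates its belief only from observed choices:

```python
# Toy illustration of principles 2 and 3 (not from the book):
# the machine begins uncertain about human preferences and treats
# observed behavior as its source of information, via Bayes' rule.

OPTIONS = ["tea", "coffee"]  # hypothetical choice set

# Principle 2: a uniform prior over candidate preference hypotheses.
hypotheses = {"prefers_tea": 0.5, "prefers_coffee": 0.5}

def likelihood(choice, hypothesis):
    """P(observed choice | hypothesis): assume a noisily rational
    human who picks the preferred option 90% of the time."""
    preferred = "tea" if hypothesis == "prefers_tea" else "coffee"
    return 0.9 if choice == preferred else 0.1

def update(beliefs, choice):
    """Principle 3: human behavior is the information source."""
    posterior = {h: p * likelihood(choice, h) for h, p in beliefs.items()}
    total = sum(posterior.values())
    return {h: p / total for h, p in posterior.items()}

# The machine observes the human choose tea twice.
beliefs = update(hypotheses, "tea")
beliefs = update(beliefs, "tea")
print(beliefs)  # confidence in "prefers_tea" rises, but never reaches 1
```

Because every hypothesis keeps nonzero probability, the machine never becomes fully certain, which is exactly the property Russell argues keeps it deferential to further human input.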

Russell explores inverse reinforcement learning, in which a machine infers a reward function from observed behavior, as a possible basis for a mechanism for learning human preferences. [1] :191–193
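The core idea of inverse reinforcement learning can be sketched in a few lines (an illustrative toy only; the actions, the two candidate reward functions, and the Boltzmann-rationality model with temperature `beta` are assumptions, and real IRL algorithms are far more involved). The learner scores candidate reward functions by how well a noisily rational agent holding that reward would explain the observed behavior:

```python
# Minimal sketch of the idea behind inverse reinforcement learning:
# infer which candidate reward function best explains observed choices,
# under a Boltzmann-rational (softmax) model of the human.
import math

ACTIONS = ["recycle", "landfill"]  # hypothetical choice set

# Candidate reward functions the learner considers (hypothetical).
candidate_rewards = {
    "values_environment": {"recycle": 1.0, "landfill": 0.0},
    "values_convenience": {"recycle": 0.0, "landfill": 1.0},
}

def choice_prob(action, reward, beta=2.0):
    """Boltzmann model: higher-reward actions are chosen more often,
    but not deterministically."""
    weights = {a: math.exp(beta * reward[a]) for a in ACTIONS}
    return weights[action] / sum(weights.values())

def log_likelihood(observed, reward):
    """How well this reward function explains the observed behavior."""
    return sum(math.log(choice_prob(a, reward)) for a in observed)

# Observed human behavior: mostly recycling, with one lapse.
observed = ["recycle", "recycle", "landfill", "recycle"]

best = max(candidate_rewards,
           key=lambda name: log_likelihood(observed, candidate_rewards[name]))
print(best)  # the reward hypothesis that best explains the behavior
```

Note the noise model matters: because the human is modeled as only approximately rational, occasional reward-suboptimal choices (the single "landfill") do not rule a hypothesis out, they merely lower its likelihood.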

Reception

Several reviewers agreed with the book's arguments. Ian Sample in The Guardian called it "convincing" and "the most important book on AI this year". [2] Richard Waters of the Financial Times praised the book's "bracing intellectual rigour". [3] Kirkus Reviews endorsed it as "a strong case for planning for the day when machines can outsmart us". [4]

The same reviewers characterized the book as "wry and witty", [2] or "accessible" [4] due to its "laconic style and dry humour". [3] Matthew Hutson of the Wall Street Journal said "Mr. Russell's exciting book goes deep while sparkling with dry witticisms". [5] A Library Journal reviewer called it "The right guide at the right time". [6]

James McConnachie of The Times wrote "This is not quite the popular book that AI urgently needs. Its technical parts are too difficult, and its philosophical ones too easy. But it is fascinating and significant." [7]

By contrast, Human Compatible was criticized in its Nature review by David Leslie, an Ethics Fellow at the Alan Turing Institute; and similarly in a New York Times opinion essay by Melanie Mitchell. One point of contention was whether superintelligence is possible. Leslie states Russell "fails to convince that we will ever see the arrival of a 'second intelligent species'", [8] and Mitchell doubts a machine could ever "surpass the generality and flexibility of human intelligence" without losing "the speed, precision, and programmability of a computer". [9] A second disagreement was whether intelligent machines would naturally tend to adopt so-called "common sense" moral values. In Russell's thought experiment about a geoengineering robot that "asphyxiates humanity to deacidify the oceans", Leslie "struggles to identify any intelligence". Similarly, Mitchell believes an intelligent robot would naturally tend to be "tempered by the common sense, values and social judgment without which general intelligence cannot exist". [10] [11]

The book was longlisted for the 2019 Financial Times/McKinsey Award. [12]

See also


Artificial intelligence – Intelligence of machines or software

Artificial intelligence (AI) is the intelligence of machines or software, as opposed to the intelligence of humans or animals. It may also refer to the corresponding field of study, which develops and studies intelligent machines, or to the intelligent machines themselves.

Eliezer Yudkowsky – American AI researcher and writer (born 1979)

Eliezer S. Yudkowsky is an American artificial intelligence researcher and writer on decision theory and ethics, best known for popularizing ideas related to friendly artificial intelligence, including the idea that there might not be a "fire alarm" for AI. He is the founder of and a research fellow at the Machine Intelligence Research Institute (MIRI), a private research nonprofit based in Berkeley, California. His work on the prospect of a runaway intelligence explosion influenced philosopher Nick Bostrom's 2014 book Superintelligence: Paths, Dangers, Strategies.

Friendly artificial intelligence – AI to benefit humanity

Friendly artificial intelligence is hypothetical artificial general intelligence (AGI) that would have a positive (benign) effect on humanity or at least align with human interests or contribute to fostering the improvement of the human species. It is a part of the ethics of artificial intelligence and is closely related to machine ethics. While machine ethics is concerned with how an artificially intelligent agent should behave, friendly artificial intelligence research is focused on how to practically bring about this behavior and ensuring it is adequately constrained.

Nick Bostrom – Swedish philosopher and writer (born 1973)

Nick Bostrom is a Swedish philosopher at the University of Oxford known for his work on existential risk, the anthropic principle, human enhancement ethics, whole brain emulation, superintelligence risks, and the reversal test. He is the founding director of the Future of Humanity Institute at Oxford University.

Stuart J. Russell – British computer scientist and author (born 1962)

Stuart Jonathan Russell is a British computer scientist known for his contributions to artificial intelligence (AI). He is a professor of computer science at the University of California, Berkeley, where he holds the Smith-Zadeh Chair in Engineering, and was from 2008 to 2011 an adjunct professor of neurological surgery at the University of California, San Francisco. He founded and leads the Center for Human-Compatible Artificial Intelligence (CHAI) at UC Berkeley. With Peter Norvig, Russell co-authored the standard textbook of the field, Artificial Intelligence: A Modern Approach, used in more than 1,500 universities in 135 countries.

A superintelligence is a hypothetical agent that possesses intelligence far surpassing that of the brightest and most gifted human minds. "Superintelligence" may also refer to a property of problem-solving systems whether or not these high-level intellectual competencies are embodied in agents that act in the world. A superintelligence may or may not be created by an intelligence explosion and associated with a technological singularity.

The Machine Intelligence Research Institute (MIRI), formerly the Singularity Institute for Artificial Intelligence (SIAI), is a non-profit research institute focused since 2005 on identifying and managing potential existential risks from artificial general intelligence. MIRI's work has focused on a friendly AI approach to system design and on predicting the rate of technology development.

AI takeover – Hypothetical artificial intelligence scenario

An AI takeover is a hypothetical scenario in which artificial intelligence (AI) becomes the dominant form of intelligence on Earth, as computer programs or robots effectively take control of the planet away from the human species. Possible scenarios include replacement of the entire human workforce, takeover by a superintelligent AI, and the popular notion of a robot uprising. Stories of AI takeovers are very popular throughout science fiction. Some public figures, such as Stephen Hawking and Elon Musk, have advocated research into precautionary measures to ensure future superintelligent machines remain under human control.

Outline of artificial intelligence – Overview of and topical guide to artificial intelligence

The following outline is provided as an overview of and topical guide to artificial intelligence:

In the field of artificial intelligence (AI) design, AI capability control proposals, also referred to as AI confinement, aim to increase our ability to monitor and control the behavior of AI systems, including proposed artificial general intelligences (AGIs), in order to reduce the danger they might pose if misaligned. However, capability control becomes less effective as agents become more intelligent and their ability to exploit flaws in human control systems increases, potentially resulting in an existential risk from AGI. Therefore, the Oxford philosopher Nick Bostrom and others recommend capability control methods only as a supplement to alignment methods.

Machine ethics is a part of the ethics of artificial intelligence concerned with adding or ensuring moral behaviors of man-made machines that use artificial intelligence, otherwise known as artificial intelligent agents. Machine ethics differs from other ethical fields related to engineering and technology. Machine ethics should not be confused with computer ethics, which focuses on human use of computers. It should also be distinguished from the philosophy of technology, which concerns itself with the grander social effects of technology.

Superintelligence: Paths, Dangers, Strategies – 2014 book by Nick Bostrom

Superintelligence: Paths, Dangers, Strategies is a 2014 book by the philosopher Nick Bostrom. It explores how superintelligence could be created and what its features and motivations might be. It argues that superintelligence, if created, would be difficult to control, and that it could take over the world in order to accomplish its goals. The book also presents strategies to help make superintelligences whose goals benefit humanity. It was particularly influential for raising concerns about existential risk from artificial intelligence.

Instrumental convergence is the hypothetical tendency for most sufficiently intelligent beings to pursue similar sub-goals, even if their ultimate goals are quite different. More precisely, agents may pursue instrumental goals—goals which are made in pursuit of some particular end, but are not the end goals themselves—without ceasing, provided that their ultimate (intrinsic) goals may never be fully satisfied.

Existential risk from artificial general intelligence – Hypothesized risk to human existence

Existential risk from artificial general intelligence is the hypothesis that substantial progress in artificial general intelligence (AGI) could result in human extinction or an irreversible global catastrophe.

AI alignment – AI conformance to the intended objective

In the field of artificial intelligence (AI), AI alignment research aims to steer AI systems towards humans' intended goals, preferences, or ethical principles. An AI system is considered aligned if it advances the intended objectives. A misaligned AI system pursues some objectives, but not the intended ones.

Life 3.0 – 2017 book by Max Tegmark on artificial intelligence

Life 3.0: Being Human in the Age of Artificial Intelligence is a 2017 non-fiction book by Swedish-American cosmologist Max Tegmark. Life 3.0 discusses artificial intelligence (AI) and its impact on the future of life on Earth and beyond. The book discusses a variety of societal implications, what can be done to maximize the chances of a positive outcome, and potential futures for humanity, technology and combinations thereof.

AI aftermath scenarios – Overview of AI's possible effects on the human state

Many scholars believe that advances in artificial intelligence, or AI, will eventually lead to a post-scarcity economy in which intelligent machines can outperform humans in nearly every, if not every, domain. The questions of what such a world might look like, and whether specific scenarios constitute utopias or dystopias, are the subject of active debate.

Center for Human-Compatible Artificial Intelligence – AI safety research center

The Center for Human-Compatible Artificial Intelligence (CHAI) is a research center at the University of California, Berkeley focusing on advanced artificial intelligence (AI) safety methods. The center was founded in 2016 by a group of academics led by Berkeley computer science professor and AI expert Stuart J. Russell. Russell is known for co-authoring the widely used AI textbook Artificial Intelligence: A Modern Approach.

Artificial Intelligence: A Guide for Thinking Humans – 2019 book by Melanie Mitchell

Artificial Intelligence: A Guide for Thinking Humans is a 2019 nonfiction book by Santa Fe Institute professor Melanie Mitchell. The book provides an overview of artificial intelligence (AI) technology, and argues that people tend to overestimate the abilities of artificial intelligence.

Artificial intelligence agents sometimes misbehave due to faulty objective functions that fail to adequately encapsulate the programmers' intended goals. The misaligned objective function may look correct to the programmer, and may even perform well in a limited test environment, yet may still produce unanticipated and undesired results when deployed.

References

  1. 1 2 3 4 5 Russell, Stuart (October 8, 2019). Human Compatible: Artificial Intelligence and the Problem of Control . United States: Viking. ISBN   978-0-525-55861-3. OCLC   1083694322.
  2. 1 2 Sample, Ian (October 24, 2019). "Human Compatible by Stuart Russell review – AI and our future". The Guardian .
  3. 1 2 Waters, Richard (18 October 2019). "Human Compatible — can we keep control over a superintelligence?". www.ft.com. Retrieved 23 February 2020.
  4. 1 2 "HUMAN COMPATIBLE | Kirkus Reviews". Kirkus Reviews . 2019. Retrieved 23 February 2020.
  5. Hutson, Matthew (November 19, 2019). "'Human Compatible' and 'Artificial Intelligence' Review: Learn Like a Machine". The Wall Street Journal .
  6. Hahn, Jim (2019). "Human Compatible: Artificial Intelligence and the Problem of Control". Library Journal. Retrieved 23 February 2020.
  7. McConnachie, James (October 6, 2019). "Human Compatible by Stuart Russell review — an AI expert's chilling warning". The Times .
  8. Leslie, David (2019-10-02). "Raging robots, hapless humans: the AI dystopia". Nature. 574 (7776): 32–33. Bibcode:2019Natur.574...32L. doi:10.1038/d41586-019-02939-0.
  9. Mitchell, Melanie (2019-10-31). "Opinion | We Shouldn't be Scared by 'Superintelligent A.I.'". The New York Times. ISSN   0362-4331 . Retrieved 2023-07-18.
  10. Leslie, David (2 October 2019). "Raging robots, hapless humans: the AI dystopia". Nature. 574 (7776): 32–33. Bibcode:2019Natur.574...32L. doi:10.1038/d41586-019-02939-0.
  11. Mitchell, Melanie (October 31, 2019). "We Shouldn't be Scared by 'Superintelligent A.I.'". The New York Times .
  12. Hill, Andrew (11 August 2019). "Business Book of the Year Award 2019 — the longlist". www.ft.com. Retrieved 23 February 2020.