Praise for Human Compatible:
“This is the most important book I have read in quite some time. It lucidly explains how the coming age of artificial super-intelligence threatens human control. Crucially, it also introduces a novel solution and a reason for hope.” —Daniel Kahneman, winner of the Nobel Prize and author of Thinking, Fast and Slow
“A must-read: this intellectual tour-de-force by one of AI's true pioneers not only explains the risks of ever more powerful artificial intelligence in a captivating and persuasive way, but also proposes a concrete and promising solution.” —Max Tegmark, author of Life 3.0
“A thought-provoking and highly readable account of the past, present and future of AI . . . Russell is grounded in the realities of the technology, including its many limitations, and isn’t one to jump at the overheated language of sci-fi . . . If you are looking for a serious overview to the subject that doesn’t talk down to its non-technical readers, this is a good place to start . . . [Russell] deploys a bracing intellectual rigour . . . But a laconic style and dry humour keep his book accessible to the lay reader.” —Financial Times
“A carefully written explanation of the concepts underlying AI as well as the history of their development. If you want to understand how fast AI is developing and why the technology is so dangerous, Human Compatible is your guide.” —TechCrunch
“Sound[s] an important alarm bell . . . Human Compatible marks a major stride in AI studies, not least in its emphasis on ethics. At the book’s heart, Russell incisively discusses the misuses of AI.” —Nature
“An AI expert’s chilling warning . . . Fascinating, and significant . . . Russell is not warning of the dangers of conscious machines, just that superintelligent ones might be misused or might misuse themselves.” —The Times (UK)
“An excellent, nuanced history of the field.” —The Telegraph (UK)
“A brilliantly clear and fascinating exposition of the history of computing thus far, and how very difficult true AI will be to build.” —The Spectator (UK)
“Human Compatible made me a convert to Russell's concerns with our ability to control our upcoming creation—super-intelligent machines. Unlike outside alarmists and futurists, Russell is a leading authority on AI. His new book will educate the public about AI more than any book I can think of, and is a delightful and uplifting read.” —Judea Pearl, Turing Award-winner and author of The Book of Why
“Stuart Russell has long been the most sensible voice in computer science on the topic of AI risk. And he has now written the book we've all been waiting for: a brilliant and utterly accessible guide to what will be either the best or worst technological development in human history.” —Sam Harris, author of Waking Up and host of the Making Sense podcast
“This beautifully written book addresses a fundamental challenge for humanity: increasingly intelligent machines that do what we ask but not what we really intend. Essential reading if you care about our future.” —Yoshua Bengio, winner of the 2019 Turing Award and co-author of Deep Learning
“Authoritative [and] accessible . . . A strong case for planning for the day when machines can outsmart us.” —Kirkus Reviews
“The right guide at the right time for technology enthusiasts seeking to explore the primary concepts of what makes AI valuable while simultaneously examining the disconcerting aspects of AI misuse.” —Library Journal
“The same mix of de-mystifying authority and practical advice that Dr. Benjamin Spock once brought to the care and raising of children, Dr. Stuart Russell now brings to the care, raising, and yes, disciplining of machines. He has written the book that most—but perhaps not all—machines would like you to read.” —George Dyson, author of Turing's Cathedral
“Persuasively argued and lucidly imagined, Human Compatible offers an unflinching, incisive look at what awaits us in the decades ahead. No researcher has argued more persuasively about the risks of AI or shown more clearly the way forward. Anyone who takes the future seriously should pay attention.” —Brian Christian, author of Algorithms to Live By
“A book that charts humanity's quest to understand intelligence, pinpoints why it became unsafe, and shows how to course-correct if we want to survive as a species. Stuart Russell, author of the leading AI textbook, can do all that with the wealth of knowledge of a prominent AI researcher and the persuasive clarity and wit of a brilliant educator.” —Jaan Tallinn, co-founder of Skype
“Can we coexist happily with the intelligent machines that humans will create? ‘Yes,’ answers Human Compatible, ‘but first . . .’ Through a brilliant reimagining of the foundations of artificial intelligence, Russell takes you on a journey from the very beginning, explaining the questions raised by an AI-driven society and beautifully making the case for how to ensure machines remain beneficial to humans. A totally readable and crucially important guide to the future from one of the world's leading experts.” —Tabitha Goldstaub, co-founder of CognitionX and Head of the UK Government's AI Council
“Stuart Russell, one of the most important AI scientists of the last 25 years, may have written the most important book about AI so far, on one of the most important questions of the 21st century: How to build AI to be compatible with us. The book proposes a novel and intriguing solution for this problem, while offering many thought-provoking ideas and insights about AI along the way. An accessible and engaging must-read for the developers of AI and the users of AI—that is, for all of us.” —James Manyika, chairman and director of McKinsey Global Institute
“In clear and compelling language, Stuart Russell describes the huge potential benefits of artificial intelligence, as well as the hazards and ethical challenges. It's especially welcome that a respected leading authority should offer this balanced appraisal, avoiding both hype and scaremongering.” —Lord Martin Rees, Astronomer Royal and former President of the Royal Society
Stuart Russell is a professor of Computer Science and holder of the Smith-Zadeh Chair in Engineering at the University of California, Berkeley. He has served as the Vice-Chair of the World Economic Forum's Council on AI and Robotics and as an advisor to the United Nations on arms control. He is the author (with Peter Norvig) of the definitive and universally acclaimed textbook on AI, Artificial Intelligence: A Modern Approach.
A leading artificial intelligence researcher lays out a new approach to AI that will enable us to coexist successfully with increasingly intelligent machines
In the popular imagination, superhuman artificial intelligence is an approaching tidal wave that threatens not just jobs and human relationships, but civilization itself. Conflict between humans and machines is seen as inevitable and its outcome all too predictable.
In this groundbreaking book, distinguished AI researcher Stuart Russell argues that this scenario can be avoided, but only if we rethink AI from the ground up. Russell begins by exploring the idea of intelligence in humans and in machines. He describes the near-term benefits we can expect, from intelligent personal assistants to vastly accelerated scientific research, and outlines the AI breakthroughs that still have to happen before we reach superhuman AI. He also spells out the ways humans are already finding to misuse AI, from lethal autonomous weapons to viral sabotage.
If the predicted breakthroughs occur and superhuman AI emerges, we will have created entities far more powerful than ourselves. How can we ensure they never, ever, have power over us? Russell suggests that we can rebuild AI on a new foundation, according to which machines are designed to be inherently uncertain about the human preferences they are required to satisfy. Such machines would be humble, altruistic, and committed to pursue our objectives, not theirs. This new foundation would allow us to create machines that are provably deferential and provably beneficial.
If We Succeed
A long time ago, my parents lived in Birmingham, England, in a house near the university. They decided to move out of the city and sold the house to David Lodge, a professor of English literature. Lodge was by that time already a well-known novelist. I never met him, but I decided to read some of his books: Changing Places and Small World. Among the principal characters were fictional academics moving from a fictional version of Birmingham to a fictional version of Berkeley, California. As I was an actual academic from the actual Birmingham who had just moved to the actual Berkeley, it seemed that someone in the Department of Coincidences was telling me to pay attention.
One particular scene from Small World struck me: The protagonist, an aspiring literary theorist, attends a major international conference and asks a panel of leading figures, "What follows if everyone agrees with you?" The question causes consternation, because the panelists had been more concerned with intellectual combat than ascertaining truth or attaining understanding. It occurred to me then that an analogous question could be asked of the leading figures in AI: "What if you succeed?" The field's goal had always been to create human-level or superhuman AI, but there was little or no consideration of what would happen if we did.
A few years later, Peter Norvig and I began work on a new AI textbook, whose first edition appeared in 1995. The book's final section is titled "What If We Do Succeed?" The section points to the possibility of good and bad outcomes but reaches no firm conclusions. By the time of the third edition in 2010, many people had finally begun to consider the possibility that superhuman AI might not be a good thing, but these people were mostly outsiders rather than mainstream AI researchers. By 2013, I became convinced that the issue not only belonged in the mainstream but was possibly the most important question facing humanity.
In November 2013, I gave a talk at the Dulwich Picture Gallery, a venerable art museum in south London. The audience consisted mostly of retired people (nonscientists with a general interest in intellectual matters), so I had to give a completely nontechnical talk. It seemed an appropriate venue to try out my ideas in public for the first time. After explaining what AI was about, I nominated five candidates for "biggest event in the future of humanity":
1. We all die (asteroid impact, climate catastrophe, pandemic, etc.).
2. We all live forever (medical solution to aging).
3. We invent faster-than-light travel and conquer the universe.
4. We are visited by a superior alien civilization.
5. We invent superintelligent AI.
I suggested that the fifth candidate, superintelligent AI, would be the winner, because it would help us avoid physical catastrophes and achieve eternal life and faster-than-light travel, if those were indeed possible. It would represent a huge leap, a discontinuity, in our civilization. The arrival of superintelligent AI is in many ways analogous to the arrival of a superior alien civilization but much more likely to occur. Perhaps most important, AI, unlike aliens, is something over which we have some say.
Then I asked the audience to imagine what would happen if we received notice from a superior alien civilization that they would arrive on Earth in thirty to fifty years. The word pandemonium doesn't begin to describe it. Yet our response to the anticipated arrival of superintelligent AI has been . . . well, underwhelming begins to describe it. (In a later talk, I illustrated this in the form of the email exchange shown in figure 1.) Finally, I explained the significance of superintelligent AI as follows: "Success would be the biggest event in human history . . . and perhaps the last event in human history."
From: Superior Alien Civilization
Be warned: we shall arrive in 30-50 years
To: Superior Alien Civilization
Subject: Out of office: Re: Contact
Humanity is currently out of the office. We will respond to your message when we return.
Figure 1: Probably not the email exchange that would follow the first contact by a superior alien civilization.
A few months later, in April 2014, I was at a conference in Iceland and got a call from National Public Radio asking if they could interview me about the movie Transcendence, which had just been released in the United States. Although I had read the plot summaries and reviews, I hadn't seen it because I was living in Paris at the time, and it would not be released there until June. It so happened, however, that I had just added a detour to Boston on the way home from Iceland, so that I could participate in a Defense Department meeting. So, after arriving at Boston's Logan Airport, I took a taxi to the nearest theater showing the movie. I sat in the second row and watched as a Berkeley AI professor, played by Johnny Depp, was gunned down by anti-AI activists worried about, yes, superintelligent AI. Involuntarily, I shrank down in my seat. (Another call from the Department of Coincidences?) Before Johnny Depp's character dies, his mind is uploaded to a quantum supercomputer and quickly outruns human capabilities, threatening to take over the world.
On April 19, 2014, a review of Transcendence, co-authored with physicists Max Tegmark, Frank Wilczek, and Stephen Hawking, appeared in the Huffington Post. It included the sentence from my Dulwich talk about the biggest event in human history. From then on, I would be publicly committed to the view that my own field of research posed a potential risk to my own species.
How Did We Get Here?
The roots of AI stretch far back into antiquity, but its "official" beginning was in 1956. Two young mathematicians, John McCarthy and Marvin Minsky, had persuaded Claude Shannon, already famous as the inventor of information theory, and Nathaniel Rochester, the designer of IBM's first commercial computer, to join them in organizing a summer program at Dartmouth College. The goal was stated as follows:
The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves. We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer.
Needless to say, it took much longer than a summer: we are still working on all these problems.
In the first decade or so after the Dartmouth meeting, AI had several major successes, including Alan Robinson's algorithm for general-purpose logical reasoning and Arthur Samuel's checker-playing program, which taught itself to beat its creator. The first AI bubble burst in the late 1960s, when early efforts at machine learning and machine translation failed to live up to expectations. A report commissioned by the UK government in 1973 concluded, "In no part of the field have the discoveries made so far produced the major impact that was then promised." In other words, the machines just weren't smart enough.
My eleven-year-old self was, fortunately, unaware of this report. Two years later, when I was given a Sinclair Cambridge Programmable calculator, I just wanted to make it intelligent. With a maximum program size of thirty-six keystrokes, however, the Sinclair was not quite big enough for human-level AI. Undeterred, I gained access to the giant CDC 6600 supercomputer at Imperial College London and wrote a chess program: a stack of punched cards two feet high. It wasn't very good, but it didn't matter. I knew what I wanted to do.
By the mid-1980s, I had become a professor at Berkeley, and AI was experiencing a huge revival thanks to the commercial potential of so-called expert systems. The second AI bubble burst when these systems proved to be inadequate for many of the tasks to which they were applied. Again, the machines just weren't smart enough. An AI winter ensued. My own AI course at Berkeley, currently bursting with over nine hundred students, had just twenty-five students in 1990.
The AI community learned its lesson: smarter, obviously, was better, but we would have to do our homework to make that happen. The field became far more mathematical. Connections were made to the long-established disciplines of probability, statistics, and control theory. The seeds of today's progress were sown during that AI winter, including early work on large-scale probabilistic reasoning systems and what later became known as deep learning.
Beginning around 2011, deep learning techniques began to produce dramatic advances in speech recognition, visual object recognition, and machine translation, three of the most important open problems in the field. By some measures, machines now match or exceed human capabilities in these areas. In 2016 and 2017, DeepMind's AlphaGo defeated Lee Sedol, former world Go champion, and Ke Jie, the current champion; these were events that some experts predicted wouldn't happen until 2097, if ever.
Now AI generates front-page media coverage almost every day. Thousands of start-up companies have been created, fueled by a flood of venture funding. Millions of students have taken online AI and machine learning courses, and experts in the area command salaries in the millions of dollars. Investments flowing from venture funds, national governments, and major corporations are in the tens of billions of dollars annually: more money in the last five years than in the entire previous history of the field. Advances that are already in the pipeline, such as self-driving cars and intelligent personal assistants, are likely to have a substantial impact on the world over the next decade or so. The potential economic and social benefits of AI are vast, creating enormous momentum in the AI research enterprise.
What Happens Next?
Does this rapid rate of progress mean that we are about to be overtaken by machines? No. There are several breakthroughs that have to happen before we have anything resembling machines with superhuman intelligence.
Scientific breakthroughs are notoriously hard to predict. To get a sense of just how hard, we can look back at the history of another field with civilization-ending potential: nuclear physics.
In the early years of the twentieth century, perhaps no nuclear physicist was more distinguished than Ernest Rutherford, the discoverer of the proton and the "man who split the atom." Like his colleagues, Rutherford had long been aware that atomic nuclei stored immense amounts of energy; yet the prevailing view was that tapping this source of energy was impossible.
On September 11, 1933, the British Association for the Advancement of Science held its annual meeting in Leicester. Lord Rutherford addressed the evening session. As he had done several times before, he poured cold water on the prospects for atomic energy: "Anyone who looks for a source of power in the transformation of the atoms is talking moonshine." Rutherford's speech was reported in the Times of London the next morning.
Leo Szilard, a Hungarian physicist who had recently fled from Nazi Germany, was staying at the Imperial Hotel on Russell Square in London. He read the Times' report at breakfast. Mulling over what he had read, he went for a walk and invented the neutron-induced nuclear chain reaction. The problem of liberating nuclear energy went from impossible to essentially solved in less than twenty-four hours. Szilard filed a secret patent for a nuclear reactor the following year. The first patent for a nuclear weapon was issued in France in 1939.
The moral of this story is that betting against human ingenuity is foolhardy, particularly when our future is at stake. Within the AI community, a kind of denialism is emerging, even going as far as denying the possibility of success in achieving the long-term goals of AI. It's as if a bus driver, with all of humanity as passengers, said, "Yes, I am driving as hard as I can towards a cliff, but trust me, we'll run out of gas before we get there!"
I am not saying that success in AI will necessarily happen, and I think it's quite unlikely that it will happen in the next few years. It seems prudent, nonetheless, to prepare for the eventuality. If all goes well, it would herald a golden age for humanity, but we have to face the fact that we are planning to make entities that are far more powerful than humans. How do we ensure that they never, ever have power over us?
To get just an inkling of the fire we're playing with, consider how content-selection algorithms function on social media. They aren't particularly intelligent, but they are in a position to affect the entire world because they directly influence billions of people. Typically, such algorithms are designed to maximize click-through, that is, the probability that the user clicks on presented items. The solution is simply to present items that the user likes to click on, right? Wrong. The solution is to change the user's preferences so that they become more predictable. A more predictable user can be fed items that they are likely to click on, thereby generating more revenue. People with more extreme political views tend to be more predictable in which items they will click on. (Possibly there is a category of articles that die-hard centrists are likely to click on, but it's not easy to imagine what this category consists of.) Like any rational entity, the algorithm learns how to modify the state of its environment (in this case, the user's mind) in order to maximize its own reward. The consequences include the resurgence of fascism, the dissolution of the social contract that underpins democracies around the world, and potentially the end of the European Union and NATO. Not bad for a few lines of code, even if it had a helping hand from some humans. Now imagine what a really intelligent algorithm would be able to do.
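The click-through argument can be made concrete with a toy calculation. The Python sketch below is only an illustration under invented assumptions, not a model of any real recommender system: the user's taste is a single number between 0 and 1 (0.5 is the centrist), the chance of a click rises with how extreme the user is and with how closely the item matches their taste, and the user drifts partway toward whatever they are shown. Every name and constant here (click_probability, mirror_policy, nudge_policy, the 0.2 and 0.6 coefficients, the drift rate) is hypothetical. It also compares two fixed policies rather than simulating an algorithm that learns.

```python
def click_probability(pref: float, item: float) -> float:
    """Assumed click model: more extreme users are more predictable clickers."""
    extremity = abs(pref - 0.5) * 2.0        # 0 for a centrist, 1 at either extreme
    predictability = 0.2 + 0.6 * extremity   # assumed: rises linearly from 0.2 to 0.8
    match = 1.0 - abs(item - pref)           # assumed: closer items are clicked more
    return predictability * match


def mirror_policy(pref: float) -> float:
    """Show exactly what the user already likes."""
    return pref


def nudge_policy(pref: float, step: float = 0.1) -> float:
    """Show something slightly more extreme than the user's current taste."""
    if pref >= 0.5:
        return min(1.0, pref + step)
    return max(0.0, pref - step)


def simulate(policy, pref: float = 0.55, steps: int = 50, drift: float = 0.5):
    """Return (expected total clicks, final preference) for a fixed policy."""
    total = 0.0
    for _ in range(steps):
        item = policy(pref)
        total += click_probability(pref, item)
        # Assumed dynamics: the user moves partway toward the item shown.
        pref += drift * (item - pref)
    return total, pref


if __name__ == "__main__":
    for name, policy in [("mirror", mirror_policy), ("nudge", nudge_policy)]:
        clicks, final_pref = simulate(policy)
        print(f"{name}: expected clicks = {clicks:.2f}, final preference = {final_pref:.3f}")
```

Under these assumed numbers, the nudging policy gives up a few clicks early on, because its items match the user less well, but it ends with more expected clicks overall and leaves the simulated user near the extreme. The mirroring policy leaves the user where they started.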
What Went Wrong?
The history of AI has been driven by a single mantra: "The more intelligent the better." I am convinced that this is a mistake, not because of some vague fear of being superseded but because of the way we have understood intelligence itself.
The concept of intelligence is central to who we are; that's why we call ourselves Homo sapiens, or "wise man." After more than two thousand years of self-examination, we have arrived at a characterization of intelligence that can be boiled down to this:
Humans are intelligent to the extent that our actions can be expected to achieve our objectives.
All those other characteristics of intelligence (perceiving, thinking, learning, inventing, and so on) can be understood through their contributions to our ability to act successfully. From the very beginnings of AI, intelligence in machines has been defined in the same way: