My field of work is artificial intelligence, and since I started I’ve been asking myself how we can create truly intelligent systems. Part of my brain is always thinking about the next roadblock that we’re going to run into. Why are the things we understand how to do so far going to break when we put them in the real world? What’s the nature of the breakage? What can we do to avoid that? How can we then create the next generation of systems that will do better? Also, what happens if we succeed?
Back in ’94 it was a fairly balanced picture that the benefits for mankind could be enormous, but there were potential drawbacks. Even then it was obvious that success in AI might mean massive unemployment. And then there was this question of control. If you build systems that are smarter than you, it’s obvious that there’s an issue of control. You only have to imagine that you’re a gorilla and then ask: Should my ancestors have produced humans?
From the gorilla’s point of view that probably wasn’t such a great idea. In ’94 I would say that I didn’t have a good understanding of why exactly we would fail to achieve control over AI systems. People draw analogies, for example, between gorillas and humans or humans and superior alien civilizations, but those analogies are not exact because gorillas didn’t consciously design humans and we wouldn’t be consciously designing alien civilizations.
What is the nature of the problem and can we solve it? I would like to be able to solve it. The alternative to solving the control problem is to either put the brakes on AI or prevent the development of certain types of systems altogether if we don’t know how to control them. That would be extremely difficult because there’s this huge pressure. We all want more intelligent systems; they have huge economic value.
Bill Gates said that solving machine-learning problems would be worth ten Microsofts. At that time, that would have come out to about $4 trillion, which is a decent incentive for people to move technology forward. How can we make AI more capable, and if we do, what can we do to make sure that the outcome is beneficial? Those are the questions that I ask myself.
Another question I ask is: Why do my colleagues not ask themselves this question? Is it just inertia? That a typical engineer or computer scientist is in a rut? Or are they on the rail of moving technology forward and they don’t think about where that railway is heading or whether they should turn off or slow down? Or am I just wrong? Is there some mistake in my thinking that has led me to the conclusion that the control problem is serious and difficult? I’m always asking myself if I'm making a mistake.
I go through the arguments that people make for not paying any attention to this issue and none of them hold water. They fail in such straightforward ways that it seems like the arguments are coming from a defensive reaction, not from taking the question seriously and thinking hard about it but from not wanting to consider it at all. Obviously, it’s a threat. We can look back at the history of nuclear physics, where very famous nuclear physicists were simply in denial about the possibility that nuclear physics could lead to nuclear weapons.
The idea of a nuclear weapon had been around since at least 1914, when H.G. Wells wrote The World Set Free, which included what he called atomic bombs. He didn’t quite get the physics right. He imagined bombs that would explode for weeks on end. They would liberate an enormous amount of energy, not all at once but over a long period; they would lay waste gradually to a whole city. The principle was there. There were famous physicists like Frederick Soddy who understood the risk and agitated to think about it ahead of time, but then there were other physicists like Ernest Rutherford who simply denied that it was possible that this could ever happen. He denied it was possible up until the night before Leó Szilárd invented the nuclear chain reaction. The official establishment physics position was that it could never happen, and it went from never to sixteen hours.
I don’t think the same thing could happen with AI because we need more than one breakthrough. Arguably, Szilárd’s breakthrough—figuring out that you could make a chain reaction with neutrons, which don't get repelled from the nucleus in the same way that protons do—was the key breakthrough, but it still took a few more years, five or six, before a chain reaction was demonstrated.
Five to six years is an incredibly short time. If we had five or six years to the point where there were superintelligent AI systems out there, we wouldn’t have a solution for the control problem, and we might see negative consequences. If we were lucky, they would be contained, and that would be an object lesson in why not to do it. Sort of like Chernobyl was an object lesson in why it’s important to think about containment of nuclear reactions.
I can’t claim to have thought too much about containment and control early on. My first AI project was a chess program in 1975, in high school. I read a lot of science fiction growing up, and I’d seen 2001 and a lot of Star Trek episodes. The idea of machine intelligence getting out of control had been around for donkey’s years in popular culture.
I knew about all that, but I was a pretty straightforward techno-optimist in my youth. To me the challenge of creating intelligence was just fascinating and irresistible.
I studied computer science in high school. Being very interested in machine learning, I wrote a self-teaching tic-tac-toe program and then a chess program. I read some AI books, but at the time I didn’t think that AI was a serious academic discipline.
I wanted to be a physicist, so I studied physics as an undergrad. Then I learned that there was a possibility that you could do a computer science PhD and study artificial intelligence, so I applied to PhD programs in computer science in the US, and I also applied to physics PhD programs in the UK, to Oxford and Cambridge.
For various reasons, I decided to take a break from physics. I had spoken to physics graduate students, postdocs, and professors and didn’t get a very optimistic picture of what it was like to do particle theory. You would spend a decade creeping up an author list of 290 people and, if you were lucky, after umpteen years of being a postdoc, you might get a faculty position, but you might end up being a taxi driver instead.
I graduated in ’82, and there wasn’t that much going on. It was just before string theory became popular. People were looking for grand unified theories of physics, not finding anything very promising or even testable. I remember very clearly a conversation I had with Chris Llewellyn Smith, who was on the faculty—this was shortly before he went on to be director of CERN—and I asked him what he was working on. Of the people that I had met and taken classes from at Oxford, he was the brightest, most engaging, intelligent man. He said he was working on taking all the grand unified theories then in existence, of which there were eighty-one, and converting them into mathematical logic. Having studied a little bit of AI, I knew about this. In mathematical logic it would be possible to directly compare two theories to tell if they were equivalent to each other or different and whether they had testable consequences. That was a relatively new idea for physics to do that, not just by arguing but by providing mathematical proof.
He got through sixty-four of the eighty-one theories, and it turned out that there were only three distinct theories, so all these people were producing theories not even realizing they were the same theory as everybody else’s.
Two of the three theories were, in principle, untestable, meaning they had no observable consequences on the universe at all; the third one could be tested, but it would take 10^31 years to see any observable consequence of the theory. That was a pretty depressing conversation for me, which probably tipped the balance (well, that and the mood of the grad students and the postdocs!) towards going into computer science and going to California.
When I arrived at Stanford, I had hardly met any computer scientists. I had worked for IBM for a year between high school and college. They had a few very good computer scientists where I worked. I did some interesting things there, so that gave me more of a sense of what computer science was like as an intellectual discipline.
I had met Alan Bundy at Edinburgh, because I also was admitted to the PhD program at Edinburgh, which was the best AI program in the UK. By and large, people advised me that if I got into Stanford or MIT I should go to one of them. I got into Stanford despite applying six weeks after the deadline. They were very kind to consider my application anyway.
When I got there, my first advisor was Doug Lenat. Several members of faculty had given their spiel at the beginning of semester saying, "Here’s what I work on. I’d love to have PhD students join the group." Doug was just incredibly upbeat and optimistic. He was working on cool problems. He described his Eurisko system, which was a multilayered machine-learning system that was intended to be able to grow into an arbitrarily intelligent system.
Doug was very ambitious, and I liked that. I worked with Doug for a while. Unfortunately, he didn't get tenure. His ideas were maybe a little too ambitious for most of the academic community, and some people did not see enough rigor or clarity in his papers and experiments.
I then worked with Mike Genesereth, who was much more mathematically rigorous. Every paper should have theorems. He wanted to build a very solid set of capabilities and concepts, but he still had the grand ambition of creating truly intelligent systems. He was also interested in creating useful technologies for machine diagnosis, automated design—things like that.
I interacted some with Zohar Manna, who’s more of a computational logician interested in verification and synthesis using formal logic. I absorbed a lot of interesting ideas from him, although he wasn’t particularly interested in AI, so we didn’t quite match. There was a time when I was trying to have Doug Lenat and Zohar Manna as my two thesis advisors, but they just didn’t see eye to eye at all, so it didn’t work out.
I went to Stanford in ’82. Feigenbaum was there; Nils Nilsson was at SRI down the road. Minsky had somewhat dropped out of sight. He wasn’t publishing actively. He had done the frames paper in ’76, which had influence. Stanford, as many universities do, had their brand of AI, and they didn’t really go to great pains to introduce students to everyone else’s brand of AI.
At Stanford there was the Heuristic Programming Project, which Ed Feigenbaum ran, which was very much about expert systems. Mike Genesereth was part of that, but he had a more logic-based approach. Probability was viewed as not particularly relevant. There were arguments as to why you couldn’t use probability to build expert systems.
It started to creep in largely through Eric Horvitz and David Heckerman, who were graduate students in the medical AI program with Ted Shortliffe. They had read about Judea Pearl’s work on Bayesian networks, or belief nets, as they were called then. I began to understand at that time how important that work was. When Pearl’s book came out in ’88, I was pretty convinced that what I had been told about probability was wrong; it was entirely feasible to use probability, in fact, it worked a lot better than the rule-based approach to uncertainty, which the Stanford group had been pushing.
My thesis research was in the area of machine learning, but applying the tools of logic to understand what was going on in a learning system and, in particular, how a learning system could use what it already knew to learn better from new experience. That problem was crucial and still is today, because when humans learn, they bring to bear everything they know already to help them understand the new information.
Humans learn very quickly from often one or two examples of some phenomenon or a new type of object or experience that they might have, and current learning systems might need tens of thousands or millions of examples. Current learning systems are designed to learn with little to no prior knowledge, which is good if you know nothing, but that only explains possibly the first five minutes of a human’s life. After that, the human already knows something and is already using what they know to learn the next thing.
Tabula rasa learning is a good thing to study, but it can’t be a good explanation for intelligence unless you can show that you just start with this blank slate, keep feeding it experience, and it becomes superintelligent. We’re not anywhere close to that right now.
If you think about what’s going on with current learning systems—I know this is a digression—we’re teaching them to learn to recognize a sheep or an Oldsmobile. These are discrete logical categories, and we’re doing that because it’s useful for us to have sheep recognizers or Oldsmobile recognizers or whatever it might be, but if that’s going to be part of a larger scale intelligent system, and if you’re a deep-learning disciple, you don’t believe that deep-learning networks use discrete logical categories or definite knowledge that sheep have four legs, or any of those things. Why do you think that training a sheep recognizer is a step towards general purpose intelligence unless general purpose intelligence really does operate with discrete logical categories, which, at least introspectively, we seem to?
The first thing I did that, if you like, was considered to be a big deal outside of one branch of the machine-learning community was the work on bounded rationality. Intelligence is, in my view, the ability to act successfully. The ability to think correctly or learn quickly has a purpose, which is to enable you to act successfully, to choose actions that are likely to achieve your objectives.
That definition of intelligence has been around in the form of what economists would call rationality, what control theorists would call optimal control, what people in operations research would call optimal policies for decision problems. It’s clear that in some sense that’s the right definition for what we want intelligence to be.
In AI the definition of intelligence, if there is one, had been a restricted form of that, which is that you have a logical goal—I want to get to this place, or construct this building, or whatever it might be—that’s the definition of success, and an intelligent agent is one that generates a plan which is guaranteed to achieve that goal. Of course, in the real world there are no guarantees and there are tradeoffs. You don’t want to get to a place if it means dying along the way, for example. Uncertainty and tradeoffs are encompassed in the economic definition of rationality as maximization of expected utility, but it seemed to me that that couldn’t be the basis for AI because it’s not computationally feasible.
If we set up AI as the field that builds utility-maximizing agents, then we’re never going to get anywhere because it’s not feasible. We can’t even maximize utility on a chessboard, and a chessboard is a tiny, discrete, simple, well known and fully observable slice of the real world. We have to operate in the real world, which is vastly bigger. We don’t know what the rules are. We don’t get to see all of the world at once. There are gazillions of other players, so the world is so much more complicated. Starting off with perfect rationality as your objective is just a nonstarter.
I worked on coming up with a method of defining intelligence that would necessarily have a solution, as opposed to being necessarily unsolvable. That was this idea of bounded optimality, which, roughly speaking, says that you have a machine and the machine is finite—it has finite speed and finite memory. That means that there is only a finite set of programs that can run on that machine, and out of that finite set one or some small equivalent class of programs does better than all the others; that’s the program that we should aim for.
That’s what we call the bounded optimal program for that machine and also for some class of environments that you’re intending to work in. We can make progress there because we can start with very restricted types of machines and restricted kinds of environments and solve the problem. We can say, "Here is, for that machine and this environment, the best possible program that takes into account the fact that the machine doesn’t run infinitely fast. It can only do a certain amount of computation before the world changes."
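The selection idea described above can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the three candidate programs, their step counts and decision qualities, the machine speeds, and the deadlines are all invented for the example, and a real machine would have a vastly larger (though still finite) set of programs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Program:
    """A hypothetical candidate program for a finite machine."""
    name: str
    steps: int       # computation steps needed per decision (assumed)
    quality: float   # utility of its decision if it finishes in time (assumed)


def expected_utility(program: Program, machine_speed: float,
                     deadlines: List[float]) -> float:
    """Average utility over a class of environments.

    Each environment is reduced to one number: how many seconds pass
    before the world changes. A decision that arrives late is worth 0.
    """
    time_needed = program.steps / machine_speed
    total = 0.0
    for deadline in deadlines:
        total += program.quality if time_needed <= deadline else 0.0
    return total / len(deadlines)


def bounded_optimal(programs: List[Program], machine_speed: float,
                    deadlines: List[float]) -> Program:
    """Pick the best program out of the finite candidate set."""
    return max(programs,
               key=lambda p: expected_utility(p, machine_speed, deadlines))


programs = [
    Program("reflex", steps=1, quality=0.3),
    Program("shallow_search", steps=100, quality=0.6),
    Program("deep_search", steps=10_000, quality=0.9),
]
deadlines = [0.5, 1.0, 2.0, 20.0]  # seconds before the world changes

slow = bounded_optimal(programs, machine_speed=100.0, deadlines=deadlines)
fast = bounded_optimal(programs, machine_speed=10_000.0, deadlines=deadlines)
print(slow.name, fast.name)
```

In this toy setting the best program depends on the machine: the slower machine does best with the shallow search, while the faster one can afford the deep search. The point is only that "best program for this machine and this class of environments" is a well-defined target.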
Reading a lot of philosophy when I was younger was very helpful in coming up with these ideas. One of the things that philosophers do is look for places where you’re confused or where there’s some apparent paradox and question how to resolve it. We step back and we say, "Okay, we’re confused or we have a paradox because we have bought into a bunch of assumptions about what problem we’re supposed to be solving. We can’t be doing the right thing if we run into these conceptual roadblocks. So how do you step back and change the definition of the problem?"
What we had been doing was trying to define what a rational action is—the utility-maximizing action—and then saying, "Okay, the objective for AI is to build systems that always choose the rational action." In fact, with a bounded system there is no notion of rational action that makes sense, because you’re trying to ask the question, "What am I supposed to do if it’s impossible for me to calculate what I’m supposed to do?" That question doesn’t have an answer. It doesn’t have an answer because the notion of rational action does not make sense for bounded systems. You can only talk about what the configuration is that I should have so that, on average, I’ll do best when I’m faced with decision problems in the real world.
I was aware of Kahneman and Tversky even as a grad student. Their critique of rationality is an empirical one, that there are all these experiments showing that humans aren’t rational in the classical sense. That doesn’t solve the problem from the point of view of AI. What should AI be aiming for? We’re not aiming to copy humans, and a lot of what humans do is just a consequence of evolutionary accident. There’s no assumption that humans are the pinnacle, that there is no better way to configure a human brain than the way nature has done it.
Nowadays a lot of people are reinterpreting evidence of human irrationality as evidence of human bounded optimality. If you make enough assumptions, you can show that the so-called mistakes that humans make are the consequence of having a program that’s very well designed given the limitations of human hardware. You expect the program to make that type of mistake. So if you are bounded, if you can’t do an infinite amount of computation, then what computation should you do? That led to what we call rational meta-reasoning, which is, roughly speaking, that you do the computations that you expect to improve the quality of your ultimate decision as quickly as possible.
You can apply this to a chess program and use it to control the search that the chess program does. It’s looking ahead in the game, and you can look ahead along billions of different lines in the game, but a human, from what we can tell, is only looking ahead along a few dozen lines, if that. How does a human choose what is worth thinking about? That’s the question. What is worth thinking about?
Obviously, it’s not worth thinking about, "Well, if I make this good move and my opponent does this stupid response..." Why think about that? He’s not likely to make that stupid response, so it’s not worth my time to think about how I would win the game if he did that stupid response. Humans naturally do this. When you learn to play chess, you don’t learn the alpha-beta tree search algorithm—this algorithm for how to allocate your thinking time to various branches of the tree. It’s just natural that our brains know or learn very quickly how to allocate thought to different possibilities so that we very quickly reach good decisions.
I figured out how to do that and showed that you could apply this technique of rational meta-reasoning to control things like game-tree search and get very good results without even designing an algorithm.
I still don’t think that we should think of AI as a collection of algorithms. An algorithm is a highly engineered artifact for a specific problem. We have very highly engineered algorithms for two-player games. When you go to a three-player game, you need a completely new algorithm. The two-player game algorithms don’t work. And when you do a two-player game with chance, like backgammon, you need a completely new algorithm because the two-player algorithm doesn’t work. Humans don’t operate that way. You learn to play chess. You learn to play backgammon. You don’t need some engineer to come along and give you a new algorithm. So it must all flow from some more general process of controlling your deliberations and your computation to get good decisions quickly.
Maybe you want to say it’s one algorithm, but it’s an algorithm that figures out what is the value of the possible computations I could do, and then it does the most valuable one, and that’s the algorithm. That’s it. The same algorithm operates across all these different kinds of games—single-agent search problems, two-player, two-player with chance, multi-player, planning problems. The same principle applies.
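As a rough sketch of "figure out the value of the possible computations, then do the most valuable one," the toy Python below estimates how much thinking further about each move could improve the final decision. Everything here is an assumption made for illustration: beliefs are represented as a list of equally likely utilities per move, a computation is assumed to reveal one move's true utility, and the move names, numbers, and costs are invented. It is not the actual algorithm from the work described.

```python
from typing import Dict, List, Optional


def expected(values: List[float]) -> float:
    return sum(values) / len(values)


def value_of_computation(beliefs: Dict[str, List[float]],
                         action: str, cost: float) -> float:
    """Net expected gain from resolving our uncertainty about one action.

    beliefs maps each action to equally likely possible utilities.
    Assumption: the computation reveals that action's true utility.
    """
    means = {a: expected(v) for a, v in beliefs.items()}
    best_now = max(means.values())
    best_other = max(m for a, m in means.items() if a != action)
    # After computing, we pick whichever is better: the revealed
    # utility or the best alternative we already know about.
    after = expected([max(u, best_other) for u in beliefs[action]])
    return after - best_now - cost


def choose_computation(beliefs: Dict[str, List[float]],
                       cost: float) -> Optional[str]:
    """Return the action most worth thinking about, or None to stop."""
    best = max(beliefs, key=lambda a: value_of_computation(beliefs, a, cost))
    return best if value_of_computation(beliefs, best, cost) > 0 else None


beliefs = {
    "solid_move": [5.0, 5.0],     # already well understood
    "risky_move": [0.0, 12.0],    # could be great or useless
    "blunder": [-10.0, -8.0],     # bad whatever we learn
}

print(choose_computation(beliefs, cost=0.5))  # think about the risky move
print(choose_computation(beliefs, cost=3.0))  # thinking too costly: stop
```

Thinking about the blunder has negative net value because nothing we could learn about it would change the decision, which matches the point that it is not worth analyzing lines that cannot affect what you do.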
Those were the two contributions—understanding this notion of bounded optimality as a formal definition of intelligence that you can work on, and this technique of rational meta-reasoning—that I worked on in the late ‘80s and early ‘90s that I’m most proud of.
Having thought a lot about rationality and intelligence, I then decided I had to write a textbook because I wasn’t seeing these notions clearly laid out in the existing AI textbooks. They were all: "There's this field called natural language processing, so I’ll tell you all about that; there’s a field called search, so I’ll tell you about that; there’s a field called game playing. I’ll tell you about that." There didn't seem to be a unifying thread.
There was no overarching integrating framework, so I wrote the textbook to say it’s all rational agents or bounded rational agents, and the particular methods that are developed in search problems, or game playing, or planning are responses to particular additional assumptions that you make about the environment and how rational decision making occurs under those assumptions. In search problems, we assume that the world is fully observable, that it’s deterministic, and that there's only one agent, and so on and so forth; under all those assumptions, search algorithms make sense, but they’re all just a special case of rational decision making.
I had written some notes from my undergraduate class, up to about 200 pages. They weren’t particularly intended to become a book. It was just that I found what I was lecturing to be departing more and more from what the existing textbooks said.
The AI winter began around ’88. I was exchanging e-mails with the Aspen Institute about this just yesterday. They had used the phrase “AI winter” to refer to the one that happened in the late ‘60s, early ‘70s. The phrase first appeared when Hector Levesque, in ’86, wrote a little paper saying the AI winter is coming. It was modeled on the phrase “nuclear winter,” which, as far as I can tell, was coined in ’83 because the National Research Council did a big study on potential effects of a major nuclear war on the climate. The first AI winter, using that name, was the late ‘80s, and that was after the collapse of the expert system industry.
After that, funding dried up. Students dried up. I was quite worried that the field was going to fail, and part of it was that we were still using textbooks that had been written in the ‘70s or early ‘80s. Pearl’s book came out in ’88, and by that time we had Bayesian networks, which solved a lot of the problems that caused the expert system industry to fail.
There were a number of reasons why the expert system industry failed. They said, "Okay, there’s lots of knowledge work in our economy. It’s expensive. Experts are hard to come by. They retire. They disappear, so there’s a huge economic niche in building knowledge-based expert systems, and the way you build a knowledge-based expert system is you interview the expert. You essentially ask him to describe his reasoning steps, and you write them down as rules and then you build a rule-based expert system, which mimics the expert’s reasoning steps." Unfortunately, I don’t think it works that way.
Everyone appreciated that in many of these problems there is uncertainty—medical diagnosis is the canonical example. Everyone thought of medical diagnosis partly because of the way it’s taught in medical school: If you have these symptoms, then you have this condition, and if you have this condition and this other thing, then you will go on to develop this other condition. The reasoning process was assumed to be from symptoms to conclusions to diagnoses.
They wrote rules in that direction, and of course from any given set of symptoms you can’t conclude definitively that a person has a particular disease, like Alzheimer’s, for example. So there has to be uncertainty involved. There has to be some method of combining evidence to strengthen conclusions, disconfirm evidence, and so on and so forth.
They essentially had to make up a kind of calculus for handling all this uncertainty, which was not probability theory because probability theory doesn’t admit rule-based reasoning steps. In fact, one of the main things Pearl did in his book was explain why chaining of rules cannot capture what probability theory says you should do with evidence.
What tends to happen with those systems is that, with a small number of rules, you can tweak the weights on all the rules so that on the set of cases you want to be able to handle it behaves correctly, but as you get to a larger range of cases and more rules and deeper levels of chaining, you get problems of overcounting or undercounting of evidence. You get problems where you end up concluding, with much higher certainty than you really want, that such and such is true, because the rules essentially operate a pumping cycle where they gain more and more certainty because of their own cycles in the reasoning process.
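The overcounting problem can be illustrated with a small Python sketch. It assumes a certainty-factor combination rule of the kind used in MYCIN-style systems for two positive factors, cf1 + cf2 * (1 - cf1); the source does not say which calculus any particular system used, so treat both the rule and the numbers as assumptions for illustration.

```python
def combine(cf1: float, cf2: float) -> float:
    """Combine two positive certainty factors (assumed MYCIN-style rule)."""
    return cf1 + cf2 * (1.0 - cf1)


# One observation supports a conclusion with certainty 0.6.
single_path = 0.6

# If the same observation reaches the conclusion through two rule
# chains, the calculus treats it as two independent pieces of evidence.
two_paths = combine(0.6, 0.6)

# A cycle in the rules keeps feeding the same evidence back in.
belief = 0.0
history = []
for _ in range(5):
    belief = combine(belief, 0.6)
    history.append(round(belief, 5))

print(single_path, round(two_paths, 2), history)
```

Nothing new was observed after the first step, yet the certainty climbs toward 1. Probability theory would instead require accounting for the fact that the paths share the same underlying evidence.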
What happened in practice is that as companies built larger expert systems they found them more and more difficult to get right and to maintain. There were other reasons as well, like you had to buy a Symbolics Lisp Machine to run these packages. You couldn’t integrate it with your other data processing hardware and software. You had to hire special AI programmers who knew how to program in Lisp. There were many reasons, but the main one was that the technology was flawed technically.
Another interesting question is: Is human knowledge in the form that people thought it was, these rules that you chain forward from the evidence, adding uncertainty as you go along? It turns out to be quite difficult to interview people and get those rules out. Rather than asking, "If you see symptoms A, B, and C, what disease do you conclude with what certainty?" you instead ask, "If a person has this disease, what symptoms do you expect to see?" That’s in the causal direction, and this is how an expert understands health and disease. They think in terms of "This microorganism lodges in your gut and causes this to happen and this to happen, and that’s why we see bleeding through the eyeballs," or whatever it might be.
When Horvitz and Heckerman were interviewing experts, they found that they could extract these kinds of causal conditional probabilities in the direction from disease to symptoms very quickly, that it was very natural for the expert to estimate these, and also those probabilities turn out to be very robust. Think about it this way: if a person has meningitis, there’s a causal process that leads them to have certain symptoms, and that causal process is independent of who else has meningitis. It’s independent of the size of the population of patients, et cetera.
But look at it the other way around: Meningitis gives you a stiff neck. Well, if someone has a stiff neck, what’s the probability that they have meningitis? That depends. Is there a meningitis epidemic going on? Why is this patient in my office in the first place? Were they feeling just really poorly, or did they have a stiff neck because they got into a car crash?
The probabilities that you can assess in the causal direction turn out to be much more robust. They are valid in a much wider range of circumstances than the probabilities in the diagnostic direction, because whether that probability is valid—whether you have meningitis given that you have a stiff neck—is highly dependent on other circumstances outside of the individual person, and so all those issues conspired to make the expert system industry fail, and it failed pretty quickly.
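A short Bayes' rule calculation makes the contrast concrete. The numbers below are invented for illustration and are not clinical figures: the causal probability of a stiff neck given meningitis is held fixed, and only the background rate of meningitis changes.

```python
def posterior(p_symptom_given_disease: float,
              prior_disease: float,
              p_symptom_given_no_disease: float) -> float:
    """P(disease | symptom) by Bayes' rule."""
    joint_disease = p_symptom_given_disease * prior_disease
    joint_healthy = p_symptom_given_no_disease * (1.0 - prior_disease)
    return joint_disease / (joint_disease + joint_healthy)


# Causal direction: assumed stable across circumstances.
P_STIFF_GIVEN_MENINGITIS = 0.7      # illustrative assumption
P_STIFF_GIVEN_NO_MENINGITIS = 0.01  # illustrative assumption

# Diagnostic direction: depends on how common the disease is right now.
normal = posterior(P_STIFF_GIVEN_MENINGITIS, 1 / 50_000,
                   P_STIFF_GIVEN_NO_MENINGITIS)
epidemic = posterior(P_STIFF_GIVEN_MENINGITIS, 1 / 500,
                     P_STIFF_GIVEN_NO_MENINGITIS)

print(f"normal times: {normal:.4f}, epidemic: {epidemic:.4f}")
```

The same causal knowledge yields very different diagnostic probabilities once the prior shifts. A rule that hard-codes the diagnostic number is only valid for the circumstances in which it was written, whereas the causal number can be reused.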
What happens in these things is that technology comes out—and it’s happening now with deep learning—and everyone says, “If I don’t get hold of this technology and build up a group within my company that knows how to do it and knows how to use it, then I’m going to be left behind.” So they start investing in the technology without any evidence that it works to solve their problem, just on the assumption that if they don’t they’ll be left behind. There’s a potential gain here; we can’t afford to lose it.
So they’re all sitting there, and maybe after six months or a year they’re still waiting for any return on their investment whatsoever, and then they start to hear stories about this other company who tried six times and it just failed—it didn’t work for their problem and it doesn’t work for this or that. So then they start to lose faith very quickly.
All those companies that haven’t yet had a success with the technology can switch overnight from thinking, “It’s essential to maintain our competitive edge,” to “We’d better get out of this; otherwise, we’re going to look foolish.” That’s what happened with the expert system technology in the late ‘80s, which is a shame because by then we already had the technological solutions that would have alleviated a lot of those difficulties.
I remember going to a dinner in ’93 with a bunch of Wall Street people, and I was the odd person out. I was explaining that I worked on artificial intelligence, and it was like I was working on cold fusion: “AI failed. Right? It doesn’t exist anymore,” as if somehow AI and rule-based expert systems in the commercial marketplace are the same thing. In the mind of Wall Street and the investor community, it doesn’t exist. Their view was, “Forget it,” whereas, of course, in the academic field we were still pushing ahead.
As I mentioned, I wrote a textbook that came out in late ’94 that tried to incorporate as much as we knew about how to use the rational agent framework and then a lot of probability and decision theory. Peter Norvig is my co-author. I had some course notes, and Peter was back in Berkeley from time to time. He was working at Sun in Boston. They had a research lab there.
We had discussed when he was at Berkeley with Robert Wilensky, who had been Peter’s advisor, that we might try to write a Berkeley AI book. Robert Wilensky had his view on AI—he was a student of Roger Schank's—and after a while, though I really enjoyed Robert’s company—unfortunately, he passed on a few years ago—we just couldn’t see eye to eye on the content. We just thought about the field in such a fundamentally different way that there was no way to get that to work. He had a strong personality, so there was too much conflict.
Peter is an incredibly easygoing person, and that’s one of the reasons why he’s so successful. He does not try to project his ego or his ability at all; he’s completely reasonable. You can have a discussion with him, and he doesn’t feel threatened. He doesn’t try to threaten you, so it’s very productive. He’s also a good writer and a great programmer, so we spent a lot of time writing code together to go with the book. Another important thing we did with the book was to try to build a suite of code that was as fully integrated as possible to reflect the principle of the book, that the rational agent framework was general. We succeeded to some extent. It was not perfect, but it was a step forward.
The book helped to bring what Pearl had done, for example, and some other ideas, into focus. Reinforcement learning also came out in the late ’80s, but it was not widely known or taught. The whole notion of a Markov decision process was unknown to most AI researchers. We tried to build these bridges showing that AI was continuous with statistics and with operations research, which studies Markov decision processes. So decision making under uncertainty is what they do.
In economics, studying utility theory, how do you construct these functions that describe value? We tried to bring all of that in and create these connections to other fields, and that was helpful for the field. It helped the field to grow up a bit to realize that there’s more to doing research than reading one paper in last year’s proceedings and then doing some variation on that. There’s a lot of literature in all of these other fields that is relevant to some of the problems we care about. So that was ’94, and the book was unreasonably successful.
It’s an interesting story. The issue of risk from intelligent systems goes back to the prehistory of AI. The word “robot” came from a Czech play in 1920, and in that play machine servants rise up and take over the world. There never has been a time when existential risk hasn’t been a thread in the field. Turing talked about it (I don’t know whether you could call it resignation or just blasé) when he said, "At some stage therefore, we should have to expect the machines to take control."
The “intelligence explosion” phrase came from I.J. Good, from a paper he wrote in ’65 pointing out, in particular, that sufficiently intelligent systems could do their own AI research and hardware design and produce their next generation very quickly, and that process would then accelerate and the human race would be left far behind. People look at that and they say, “That sounds right,” and then they just go back to work as if the actual semantic content was irrelevant.
Minsky pointed out that if you ask a sufficiently intelligent machine to calculate as many digits of pi as possible, which sounds very innocuous, in order to do that, it will take over the entire planet or even the reachable physical universe and convert it into a machine for calculating more digits of pi. This was the point.
Norbert Wiener wrote a paper in 1960. He had seen Arthur Samuel’s checker-playing program, which learned to play checkers by playing itself, becoming more competent at checkers than Samuel was. That was very early proof in ’57, ’58 that one of the objections to AI, that a machine can never do any more than we program it to do, was just completely misconceived; machines can surpass their programmers if they’re able to learn.
Wiener was at a stage in his life where he was thinking a lot about the impact of technology on humanity and whether this was going to be a successful long-term future if we follow along the current path, and he used The Sorcerer’s Apprentice as an example. “If you put a purpose into a machine, you better be absolutely sure that the purpose is the one that you really desire.” He said it was going to be a major problem for—he was calling it automation at that time, but we would call it intelligent systems or AI. It's an incredibly difficult thing to even imagine how things are going to play out over a long future, but if we don’t get it right now, we may have a long future that’s not good. We just have to try our best to figure it out. So I find that paper quite inspiring.
I would say, generally speaking, the companies that are doing AI prefer not to talk about risk because they don’t want their brand associated with Terminator robots, which is usually how the media portrays risk—a picture of a Terminator. That’s of course completely misleading. In the media and Hollywood, the risk always involves machines becoming spontaneously conscious and evil, and they hate us, and then all hell breaks loose.
That’s not the risk. The risk is just machines that are extremely competent that are given objectives where the solution to those objectives turns out to be something that we’re not happy with, which is the King Midas story or The Sorcerer’s Apprentice story all over again, and this is exactly what Wiener was warning against.
In Britain there’s this notion of the health and safety officer who’s appointed by the council, a busybody who goes into everyone’s office and says, “Oh, you need to have those windows locked,” or “you need to have wider doorways,” or whatever it is. But just imagine the health and safety officer going back to a million years BC, where these poor people have just invented fire so they can keep warm and stop eating raw food, and the health and safety officer says, “Ah, you can’t have any of that. It’s not safe. You’ll catch your hair on fire. You’ll cause global warming. We’ve got to put a stop to this right now.”
A million years ago would be too early to be trying to put constraints on a technology, but with respect to global warming, I would say 100 years ago would have been the right time, or 120 years ago. We had just developed the internal combustion engine and electricity generation and distribution, and we could at that time, before we became completely tied in to fossil fuels, have put a lot of energy and effort into also developing wind power and solar power, knowing that we could not rely on fossil fuels because of the consequences. And we knew. Arrhenius and other scientists had shown that this would be the consequence of burning all these fossil fuels.
Alexander Graham Bell wrote papers about it, but they were ignored. There was no vote. Governments tend to get captured by corporate lobbies and not so much by scientists. You might say the scientists invented the internal combustion engine, but they also discovered the possibility of global warming and warned about it. Society tends to take the goodies, but not listen to the downside.
When genetic engineering got going in the ’70s, most people didn’t know what that even meant. Most people didn’t even know what DNA was. Had it been fully explained, they probably would have gone along with what the scientists decided, which is that 1) we need to enforce fairly stringent safety constraints on these kinds of experiments so we don’t accidentally produce disease organisms that infect people, and 2) we’re not going to allow experiments that modify the human genome. And that’s what they did, and they were pretty praiseworthy.
What they did was impressive given that for a long time one of the main purposes of genetic testing was precisely the improvement of the human stock, as they used to call it. Good old-fashioned eugenics, which was mostly born in California and then exported to Germany in the ’30s. That was one of the main purposes for doing all this research.
For them to say, "We could do it, but we’re not going to because the social consequences are undesirable," I thought that was pretty brave. It would have been interesting to have a real public debate. I believe they did not allow journalists at that meeting, the Asilomar workshop that they held.
The new meeting, which was precipitated by the new capabilities of CRISPR, came to the same conclusion, but it’s a more leaky situation now. There are too many scientists. There are countries where there’s much less of a moral concern about modifying humanity.
It’s always very difficult for a democracy to decide on what the right regulations are for complicated technological issues. How should we regulate nuclear power? How should we regulate medicines? Often the regulation follows some catastrophe and can be poorly designed because it’s in the middle of outrage and fear.
I would much rather that when we think about AI, we think ahead as far as we can and realize that the right thing to do is not to try to hide the risks. I see, for example, the AI100 report, which just came out a couple of weeks ago. Eric Horvitz set up this AI100, a 100-year study on AI at Stanford, and they’re supposed to produce a report every few years. They assemble a panel of distinguished scientists, and they produce this report, which is intended to be a prediction about what kinds of impacts AI will have by 2030 on a typical North American person living in a city. What sorts of technologies will be available, and what will they impact? They talk about risks, and basically they deny that achieving human-level AI is even possible, which to me seems utterly bizarre. If that’s the official position of the AI community, then I think they should all just resign. The report says that there might be risks, but we shouldn’t talk about them, because if we talk about them that might prevent people from doing research on risks in order to prevent them, which just doesn’t make sense.