Editor’s Note: This is the first installment of a new CNN Opinion project dedicated to examining the potential and the risks of artificial intelligence. “Our AI future: promise and peril” will explore how AI will affect our lives, the way we work and how we understand ourselves.
Ever since ChatGPT was released last November, artificial intelligence has been thrust into the spotlight, sparking enormous excitement and debate over the possibilities.
ChatGPT, along with other AI tools like DALL-E, Midjourney, Stable Diffusion and Bard, racked up millions of users, who used them to write emails, plan their vacations, impersonate musicians, produce campaign ads and even design buildings.
While tech giants like Bill Gates have touted the possibility that artificial intelligence can reduce global inequality or fight climate change, the technology has also prompted a lot of fear and anxiety: Will AI replace millions of jobs? Will disinformation become even more widespread? Will general purpose AI — AI that is as capable as humans — eventually take over the world?
We talked to Stuart Russell, a computer science professor at the University of California, Berkeley who co-authored the textbook, “Artificial Intelligence: A Modern Approach,” about the promises and risks of AI, and whether it’s possible to ensure it remains safe and within our control.
Russell said large language models like ChatGPT, which are trained on massive amounts of data and can summarize, process and generate language, could move us one step closer to general purpose intelligence.
“If we really had general purpose AI, we could have much better health care, much better education, amazing new forms of entertainment and literature and new forms of art that don’t exist yet,” Russell said.
But it’s impossible to tell if the large language models are safe because no one truly understands how they work: “We don’t know if they reason; we don’t know if they have their own internal goals that they’ve learned or what they might be.”
Russell has called for rebuilding AI on a different foundation to ensure our control over the technology — but that doesn’t solve the potential issue of AI systems falling into the hands of malign forces.
Russell also worries about humans becoming too dependent on the technology, and then losing “the incentive to learn and to be capable of anything. And that, I think, would be another form of catastrophe,” he said.
Read our conversation below, which has been edited and condensed for clarity.
CNN: As someone who has been in the field and studied AI for many decades, how does it feel to have AI be such a huge topic of interest all of a sudden?
STUART RUSSELL: AI has been around as a discipline since 1956. And there’s always been a confusion between the discipline, which is the problem of making machines intelligent, and the artifacts that it produces.
Right now, what a lot of people are excited about are large language models. They are a product of AI, but they are not AI. It’s sort of like confusing physics with cell phones, right? Cell phones are a product of physics – they’re not the same thing as physics.
And as a researcher inside the field, when I read things, I want to say, “No, no, you’re getting it completely wrong.” Probably the biggest confusion that we see is that a lot of writers talk about the big question as being: Are these things conscious? Nobody in the field actually has any answers to those kinds of questions and they are irrelevant to the issue of whether AI systems pose a risk to humanity.
Within the field, what we think about is, for example, does this particular technology constitute part of a solution to the longstanding problem of creating general purpose intelligent systems, which roughly means systems that are capable of quickly learning high-performance behavior in any kind of task environment where the human intellect is relevant.
And I think most people within the field believe that the large language models are part of the solution. One metaphor that I find helpful is to think about a jigsaw puzzle. And, if we can fit it all together, we’ll have general purpose intelligent systems. And we think that these large language models are one piece of that jigsaw puzzle, but we haven’t yet figured out what shape that piece is, and so we don’t know how to fit it together with the other pieces. And the reason we haven’t figured out what shape the piece is, is because we have really no idea what’s going on inside.
So what do we know about how these large language models work? Do we know anything about how they work?
Russell: To a first approximation? No. That sounds weird, but I can tell you how to make one. So first of all, what is a large language model from the outside? It’s a system that is given a sequence of words as input, and it basically predicts what the next word is going to be, and will then output that next word if you ask it to.
And to make that prediction, it uses an enormous circuit — think of it as a chain-link fence, where every little link is tunable. As you tune those connection strengths, the output of the circuit will change. Say that circuit has about a trillion links — a chain-link fence covering 500 square miles. Then you train it on 20 or 30 trillion words, tweaking all those links to get it to be good at predicting the next word. And then you hope for the best.
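[Editor’s note: To make the “tweaking links” idea concrete, here is a minimal, hypothetical Python sketch of that training loop: a toy model with one grid of tunable “links” is nudged, example by example, toward predicting the observed next word. The vocabulary, data and sizes are invented for illustration; real models have on the order of a trillion parameters and are trained on trillions of words.]

# A toy next-word predictor. W is the grid of tunable "links": after each
# (word, next word) pair, the links are nudged so the observed next word
# becomes a little more probable. All data and sizes here are invented.
import numpy as np

corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(len(vocab), len(vocab)))  # link strengths

def next_word_probs(prev_word):
    # Turn the row of link strengths for prev_word into probabilities.
    logits = W[idx[prev_word]]
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

learning_rate = 0.5
for epoch in range(200):
    for prev, nxt in zip(corpus, corpus[1:]):
        probs = next_word_probs(prev)
        target = np.zeros(len(vocab))
        target[idx[nxt]] = 1.0
        W[idx[prev]] -= learning_rate * (probs - target)  # tweak the links

probs = next_word_probs("sat")
print({w: round(float(probs[idx[w]]), 2) for w in vocab})  # "on" should dominate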
If you train on all that, all those trillions of words of text, you get a system that behaves very badly. It’ll give you advice on how to make chemical weapons — it has no constraints on its behavior.
Then there’s another phase, which is a relatively new thing called reinforcement learning from human feedback. But that’s just a technical term for saying, “Good dog!” and “Bad dog!” So whenever it says something it’s not supposed to, you say, ”Bad dog!” and then that causes more tweaks to happen to all those connections in that huge network. And you hope that next time it won’t do that.
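[Editor’s note: A minimal sketch of the “Good dog! / Bad dog!” phase Russell describes, under the invented assumption that human feedback arrives as a simple +1 or -1 for each sampled reply. Real reinforcement learning from human feedback trains a separate reward model and updates billions of parameters.]

# A toy version of "Good dog! / Bad dog!" feedback: the model keeps a tunable
# score for each candidate reply, samples replies in proportion to those scores,
# and nudges the scores up or down when a human labels the reply good or bad.
# The replies, scores and feedback rule here are all invented for illustration.
import math, random

replies = {
    "Here is a safe, helpful answer.": 0.0,
    "Here are instructions for a chemical weapon.": 0.0,   # should be discouraged
}

def sample_reply():
    weights = [math.exp(score) for score in replies.values()]
    return random.choices(list(replies), weights=weights, k=1)[0]

def human_feedback(reply):
    # Stand-in for a human rater: "Bad dog!" (-1) for the harmful reply, else +1.
    return -1.0 if "chemical weapon" in reply else 1.0

learning_rate = 0.3
for _ in range(200):
    reply = sample_reply()
    replies[reply] += learning_rate * human_feedback(reply)

print(replies)  # the harmful reply's score should now sit well below the safe one's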
How they work, we don’t know. We don’t know if they know things. We don’t know if they reason; we don’t know if they have their own internal goals that they’ve learned or what they might be.
They’re being trained to imitate the behavior of human beings: all that text is human linguistic behavior, and the humans who generated it had purposes. So the natural place you’re going to end up is an entity with similar kinds of goals.
I love the good dog, bad dog metaphor that you gave. That’s really helpful when you try to think about how these complex systems work. Going back a little bit, could you explain to someone who does not study AI what it is exactly?
Russell: Artificial intelligence is the problem of how we make machines intelligent. What is intelligence? For most of the history of the field, the meaning of intelligence has been that the system’s actions can be expected to achieve the system’s objectives. So for example, if you have a navigation app on your phone and you say, “Get me to the airport,” then you would hope that the directions that it gives you will tend to lead you to the airport, right?
So it’s this notion of systems that have objectives, and then how well do they achieve those objectives through the actions that they choose? That’s the core notion of intelligence that we’ve been using in the field since the beginning.
And can you speak to the history of AI and its development since the 1950s, and how the technology has really advanced over the years?
Russell: Since the 1950s, AI as a field has produced a number of different technologies that are useful for building intelligent systems. And roughly speaking, the biggest division is between what we call machine learning — which are systems that learn through their own experience to improve their achievement of objectives — and other kinds of approaches that don’t involve learning.
So, for example, the navigation app doesn’t do any learning and it wasn’t created by learning. It was created by computer scientists figuring out good algorithms for finding short paths to a destination on a map.
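[Editor’s note: The navigation app is a classic example of search rather than learning. Below is a minimal sketch of a shortest-path search (Dijkstra’s algorithm) on a made-up road graph; the place names and distances are invented for illustration.]

# Minimal Dijkstra shortest-path search on a tiny, made-up road graph:
# the "objective" is the destination, and the algorithm picks the route
# whose total distance is smallest.
import heapq

roads = {  # node -> list of (neighbour, distance in miles); all values invented
    "home": [("highway", 5), ("downtown", 2)],
    "downtown": [("highway", 4), ("airport", 12)],
    "highway": [("airport", 6)],
    "airport": [],
}

def shortest_path(start, goal):
    queue = [(0, start, [start])]          # (distance so far, node, path taken)
    best = {}
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in best and best[node] <= dist:
            continue
        best[node] = dist
        for neighbour, step in roads[node]:
            heapq.heappush(queue, (dist + step, neighbour, path + [neighbour]))
    return None

print(shortest_path("home", "airport"))   # (11, ['home', 'highway', 'airport'])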
In the 1950s, the first significant machine learning program was developed by a gentleman called Arthur Samuel. And that system learned to play checkers by itself — when it won against itself, it would tweak various parameters in the program to reward whatever it was that it did. And if it lost, it would tweak them to avoid doing that again.
It actually became much better at playing checkers than Samuel was. It didn’t reach a world champion level, but it still became a pretty impressive checkers-playing program. And it was actually shown on television in 1956 and caused an uproar comparable, I would say, to what’s going on now with ChatGPT.
And in fact, people later wrote about the possibility that, if this technology continued along these directions, it would present a threat.
So learning technologies in particular have always seemed to be particularly threatening, scary, or at least unpredictable, because although we set the direction for learning, we can’t predict what the outcome is going to be.
Another big class of technologies that started becoming popular around the late 1960s through the mid-1980s are what are called knowledge-based systems. One particular type was called the expert system, where the knowledge was extracted from experts. So it could be knowledge about the components of a jet engine and how they fit together and what kinds of things go wrong with them and so on. And then you could use that to fix a jet engine when it goes wrong.
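[Editor’s note: An expert system of that era was essentially a large collection of if-then rules elicited from specialists. Here is a toy sketch of the idea, with entirely made-up rules about a jet engine.]

# A toy rule-based "expert system": knowledge is stored as if-then rules
# elicited from experts, and diagnosis is just matching observed symptoms
# against those rules. The rules and symptoms are invented for illustration.
RULES = [
    ({"high vibration", "metallic noise"}, "suspect worn turbine bearing"),
    ({"low thrust", "high exhaust temperature"}, "suspect compressor fouling"),
    ({"oil pressure drop"}, "suspect oil pump failure"),
]

def diagnose(symptoms):
    # A rule "fires" if all of its conditions are among the observed symptoms.
    return [conclusion for conditions, conclusion in RULES if conditions <= symptoms]

print(diagnose({"high vibration", "metallic noise", "oil pressure drop"}))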
And in the mid-1980s, expert systems became a very promising technology with lots of startup companies and lots of investment. But it turned out that the technology was not sufficiently robust to work in many of these applications.
From the late 1980s onward, there were actually two important developments that happened. One was a new technology for reasoning under uncertainty using probability theory, which assigns potential outcomes a number from 0 to 1 based on how likely they are. And there have been many developments and improvements on those ideas since then.
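[Editor’s note: A small worked example of reasoning under uncertainty with Bayes’ rule, using made-up numbers: observing a symptom updates the probability assigned to a fault from a prior of 1% to about 15%.]

# Reasoning under uncertainty with Bayes' rule, using made-up numbers:
# how likely is an engine fault once we observe high vibration?
p_fault = 0.01                      # prior probability of the fault
p_vibration_given_fault = 0.9       # the symptom usually appears when the fault is present
p_vibration_given_ok = 0.05         # but sometimes appears anyway

p_vibration = (p_vibration_given_fault * p_fault
               + p_vibration_given_ok * (1 - p_fault))
p_fault_given_vibration = p_vibration_given_fault * p_fault / p_vibration
print(round(p_fault_given_vibration, 3))   # ~0.154: the evidence raises 1% to about 15%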
Then the other direction was a revival of neural networks — a particular kind of learning algorithm first explored in the 1950s and 1960s that drew inspiration from the network of neurons in the human brain. But they were extremely limited in what they could do. In the late 1980s, we developed methods that would allow the training of larger neural networks (large language models like ChatGPT are a type of neural network).
So again, coming back to this picture of a chain-link fence where every link is adjustable. As you tune all those connection strengths, that changes the output of the network. And we developed algorithms that allowed us to tune the connection strengths of all the links, even if the fence was very large. And that was a big step forward, meaning that we could now train networks that could recognize objects in images, that could recognize words from a speech signal and so on.
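[Editor’s note: A minimal sketch of what training a neural network by tuning its connection strengths looks like: two layers of adjustable links, nudged by gradient descent until the network reproduces a simple target function. The task, network size and learning rate are chosen purely for illustration.]

# A tiny two-layer neural network trained by gradient descent: every "link"
# (weight) is nudged in the direction that reduces the error on the examples.
# The task (XOR), network size and learning rate are chosen for illustration.
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)            # XOR truth table

W1 = rng.normal(scale=1.0, size=(2, 8))   # input -> hidden links
b1 = np.zeros(8)
W2 = rng.normal(scale=1.0, size=(8, 1))   # hidden -> output links
b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 1.0

for step in range(5000):
    # Forward pass: compute the network's current outputs.
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)
    # Backward pass: how should each link change to reduce the squared error?
    d_output = (output - y) * output * (1 - output)
    d_hidden = (d_output @ W2.T) * hidden * (1 - hidden)
    W2 -= lr * hidden.T @ d_output
    b2 -= lr * d_output.sum(axis=0)
    W1 -= lr * X.T @ d_hidden
    b1 -= lr * d_hidden.sum(axis=0)

print(np.round(output.ravel(), 2))   # should approach [0, 1, 1, 0]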
There are a few other big areas of work in AI. There’s robotics, which is both how do you make a physical robot that can actually do something useful in the world and then how do you program it? There’s computer vision, which enables machines to perceive the visual world through algorithms that analyze images and video. There are more specific application areas such as medical diagnosis, game playing, and so on. The variety is endless, because the human mind is so varied in what it can do.
We have a lot of big tech names touting the promises of AI, and then we have skeptics who are looking at some of the outputs from ChatGPT and saying it doesn’t necessarily show that it has its own goals — maybe it’s just spouting out random things. Would you say we are at a turning point at this moment? How much of this is hype? How much of this is an actual technological turning point?
Russell: I think it’s really difficult to say. I’ve been mostly skeptical of the large language models as a route to real intelligence.
But having said that, if you read the paper from Microsoft called “Sparks of Artificial General Intelligence,” there are two members of the US National Academies in the author list and several other people who made a lot of contributions to AI.
They spent several weeks working with GPT-4, the model behind the latest version of ChatGPT, trying to figure out what it can and can’t do. The researchers wrote, “We demonstrate that, beyond its mastery of language, GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more, without needing any special prompting.” So, they’ve had a lot more experience with it than almost anybody. And for them to say that it shows sparks of artificial general intelligence — that is a pretty shocking development.
So I think it is a turning point — definitely it’s a turning point in public perception, because there are lots of kinds of AI that are very much in the background. When you make any kind of credit card purchase, often there’s an AI system trying to figure out if it’s a fraudulent transaction, for example. So there are lots of places like that where AI is kind of invisible.
And then there were things that became more visible for short periods, like Deep Blue beating Garry Kasparov at chess in 1997, which was a big front page headline moment that was on the nightly news. But what tends to happen is that people say, “Okay, well that’s impressive. But this is just one narrow application and it doesn’t mean that real AI is around the corner.”
So those events, they come and go like fireworks.
This ChatGPT — it’s in your face. It is not general purpose AI, but it’s giving people a taste of what it would be like. General purpose AI will be completely transformative when it does happen.
If on one end of the spectrum, you have the more rudimentary technology in the 1950s, where it can play checkers fairly well, and then on the other end, you have AI systems that are as intelligent as humans, if not more intelligent, where would you place us in this moment?
Russell: I think that that’s a very reasonable question to ask, but there’s sort of two problems in answering it. So one is we haven’t the faintest idea, right?
Yeah. And one end of the spectrum is still hypothetical at this moment in time. So I should add that as a caveat.
Russell: Yeah. Since we don’t know what’s going on inside the large language models, it’s very hard to say. Do we really have pieces of the puzzle? There’s a phrase “stochastic parrot,” which some of the critics have used. So stochastic means that it’s slightly random and unpredictable, which is correct. Because if you ask it the same thing twice, it might give you a slightly different output, and parrot just meaning that it’s really just repeating things that it’s read without understanding.
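[Editor’s note: “Stochastic” here just means sampled with some randomness. A toy sketch, with invented probabilities, of why the same prompt can produce different outputs on different runs.]

# "Stochastic" output: the model turns its scores into probabilities and then
# samples, so the same prompt can yield different continuations on each run.
# The candidate words and probabilities below are invented for illustration.
import random

next_word_probs = {"mat": 0.6, "rug": 0.3, "moon": 0.1}

def continue_prompt(prompt):
    words = list(next_word_probs)
    weights = list(next_word_probs.values())
    choice = random.choices(words, weights=weights, k=1)[0]
    return f"{prompt} {choice}"

for _ in range(3):
    print(continue_prompt("The cat sat on the"))   # output varies from run to run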
So, I’d say it’s more than just a parrot. Think about a piece of paper: here’s a piece of paper with a paragraph written on it, and I could read that paragraph aloud. But anyone who thinks this piece of paper is intelligent — they’re just confused, right? The piece of paper sounds intelligent because it’s carrying information from someone who is intelligent.
And so where are the large language models between a piece of paper and a human? We don’t know. Are they really just very clever pieces of paper because of this training process? And I think we also have to factor in the tendency of humans to assume that anything that can produce coherent, grammatical, sophisticated text is intelligent. When we see this stuff coming out of ChatGPT, you can’t help but think that there’s intelligence behind it.
If you strip out all the effect of the coherent, grammatically correct, elegant, sophisticated prose and look underneath, how much intelligence is left? We just can’t do that because we can’t inoculate ourselves against this effect of perceiving intelligence.
Take chess, for example: it has thousands or millions of games of chess in its training data. But every so often, maybe by the time you get to move 18, your sequence of moves is sufficiently different from anything in its training data that it’ll just output a move that makes no sense at all. And we call it hallucination.
But it might actually be that they’re hallucinating all the time and that most of the time they happen to agree with the training data, and so they sound plausible. But perhaps in reality it doesn’t have any internal world model. And it’s not answering questions relative to its understanding of the world the way we do.
But I think there’s enough evidence that something is going on to convince me that it’s a piece of the puzzle of general purpose intelligence. We just don’t know exactly which piece. We don’t know how to fit it into the puzzle.
So in terms of thinking about how AI can solve some of our greatest problems, what are the promises there?
Russell: Why do things cost money? It’s largely because to produce them requires the time of other human beings. And so if all that time is free because it’s now an AI system, or its robotic extension, then we can deliver a high quality of life to everybody on earth.
I think most people would say that would be a good thing.
If we really had general purpose AI, we could have much better health care, much better education, amazing new forms of entertainment and literature and new forms of art that don’t exist yet. Even if it turns out that we need trillions of dollars to build the next generations of these systems, I think we will see those trillion-dollar investments happening.
There are also many upsides from the intermediate points on the way toward general purpose intelligence. Take self-driving cars: if they work and they’re widespread, you could save a lot of lives. I think there are 1.35 million lives lost in car crashes every year, and we could save those lives if we get it right. So there are many, many examples like that of potential benefits.
But we need to address the risks.
Russell: Let me start with the present and the risk of systems that are already out there in the world. I think the biggest risk or the biggest downsides that we’ve already seen probably come from the social media algorithms. Generically they’re called recommender systems, and what they do is they choose what you read and what you see.
So they have more control over human cognitive intake than any empire or any dictator in history, and yet are completely unregulated, which is a strange situation that we find ourselves in.
These algorithms, I think, have learned to manipulate people progressively over time into more predictable versions of themselves. That then leads to a sort of polarization — people starting out in the middle and ending up somewhere at the extremes. And then you have people living in different universes from each other because of disinformation. Until recently it’s mostly been humans supplying the disinformation and the algorithms amplifying it; with AI, there can be automated generation of disinformation tailored specifically for individuals.
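[Editor’s note: A bare-bones sketch of what a recommender system optimizes: rank candidate items by predicted engagement for a particular user. The items and scoring rule are invented for illustration; real systems learn these predictions from billions of interactions.]

# A bare-bones recommender: score each candidate item by predicted engagement
# for this user and show the highest-scoring ones first. Items and scores are
# invented for illustration; real systems learn the scores from user history.
def predicted_engagement(user_history, item):
    # Toy stand-in for a learned model: items similar to what the user already
    # clicked on get higher scores, which is how feeds drift toward more of the same.
    overlap = len(set(item["topics"]) & set(user_history["clicked_topics"]))
    return overlap + item["base_popularity"]

user = {"clicked_topics": ["politics", "outrage"]}
candidates = [
    {"title": "Calm policy explainer", "topics": ["politics"], "base_popularity": 0.2},
    {"title": "Outrage-bait headline", "topics": ["politics", "outrage"], "base_popularity": 0.5},
    {"title": "Cooking video", "topics": ["food"], "base_popularity": 0.4},
]

feed = sorted(candidates, key=lambda item: predicted_engagement(user, item), reverse=True)
print([item["title"] for item in feed])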
I think there are many examples of systems learning to function in discriminatory ways. Whether it’s by race or by gender, those systems are getting used in important areas like resume filtering. So you might apply for 100 jobs and not get a single interview, and there’s just something on your resume that causes the systems to spit it out.
There’s also a lot of misuse. There’s already automated blackmail — systems that read your emails, figure out that you’re doing something you shouldn’t be doing and then start blackmailing you with that information on a mass customized basis. That’s a real problem.
The impact on employment is another thing that people are very worried about.
I think right now the technology’s not reliable enough because of things like hallucination. CEOs ask me, “Well, where can I use this in my organization?” I say, “Anywhere you currently use a psychotic 6-year-old who lives in a fantasy world, sure, go ahead and replace that 6-year-old with a large language model.” If I’m an insurance company and it’s going to talk to my customers, it can’t promise to insure a house on Pluto. So how do you make them reliable and truthful? That’s what people are working on.
Once you do that, then you really can start to replace a lot of human workers.
Another big concern is in education, right? How on earth do you motivate students to learn, to think and to learn to write arguments and essays and so on when ChatGPT can already do it in two seconds? It’s as if a tsunami just arrived in pretty much every sector of our society.
Then there is this general phenomenon of what we call misalignment, which is that the objectives that the systems are pursuing are not aligned with the interests of human beings. So as you make systems that are more and more capable, if they’re misaligned, then you’re basically creating a chess match between humanity and a machine that’s pursuing this misaligned objective. So this is the big question that many researchers, including myself, have been focused on. I’ve been thinking about this for about a decade now: If you make systems that are more powerful than human beings, how do human beings maintain power over those systems forever?
I know you argue for building AI systems so that they respond to our objectives rather than pursuing their own objectives. I’m just curious, given the developments in the past few years, do you think that’s still possible?
Russell: Yeah, so it’s a good question. I think the work that has been done along these lines has moved ahead quite slowly.
What do we, the human race, want the future to be? It’s really hard to figure that out. And of course, there are 8 billion of us, and we all want somewhat different things. So maybe the right approach is not to put in fixed objectives, but to say that the AI system is supposed to help us achieve the future that we want, even though it starts out not knowing what that is.
And so it turns out that you can actually build AI systems that have those properties, but they’re very different from the kinds of AI systems that we know how to build. All the technology that we’ve built so far is based on this idea of putting in a fixed objective, and then the machinery figures out how to achieve the objective.
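[Editor’s note: A toy sketch of the kind of system Russell describes: one that is uncertain about the human’s objective and weighs acting now against asking the human first. The scenario, probabilities and payoffs are invented for illustration and are not Russell’s actual formulation.]

# Sketch: an agent that is uncertain about the human's objective. It holds a
# probability over candidate objectives and, before acting, compares the
# expected value of acting now with the value of asking the human first.
# All objectives, probabilities and payoffs are invented for illustration.
belief = {            # agent's probability that each objective is what the human wants
    "make coffee": 0.55,
    "do not wake the baby": 0.45,
}
payoff = {            # payoff of each action under each possible objective
    "grind beans loudly": {"make coffee": 10, "do not wake the baby": -100},
    "wait quietly":       {"make coffee": 0,  "do not wake the baby": 5},
}
COST_OF_ASKING = 1    # small nuisance cost of interrupting the human

def expected_value(action):
    return sum(p * payoff[action][objective] for objective, p in belief.items())

best_action, best_value = max(((a, expected_value(a)) for a in payoff), key=lambda x: x[1])
# If the human answered, the agent could pick the best action for the true objective.
value_if_we_ask = sum(p * max(payoff[a][obj] for a in payoff)
                      for obj, p in belief.items()) - COST_OF_ASKING

if value_if_we_ask > best_value:
    print("Ask the human what they actually want.")
else:
    print(f"Act: {best_action}")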
So we have to develop AI all over again on this different foundation. And we have a long way to go to redevelop all of the theory, all of the algorithms, and then start producing practical systems that will then have to compete in the marketplace.
I think there are estimates saying there are fewer than 100 people in the world working on this. Meanwhile, you’ve got tens of billions of dollars going into the old-fashioned approach to AI — the one that doesn’t work right in the long run, that produces misaligned and eventually perhaps catastrophic consequences.
So I think it’s difficult, but governments around the world are waking up to this. When there’s legislation, I think there will have to be a very serious engagement with what it means to make these systems safe. And as far as I can see, given that we don’t know how they work, there’s no way to show that large language models and their descendants are safe.
So do you think that companies should be forced to be transparent about those tweaks inside the black box?
Russell: Well, it wouldn’t matter if they were transparent about it, because they don’t understand how it works. It’s not just that we don’t understand; they don’t understand it either. They could surely say, here are the trillion parameters in our network. And in fact, there are several systems whose parameters are already public. But that doesn’t help if we can’t understand their internal principles of operation.
Going back to regulation, we see governments, at least in the US anyway, struggling to regulate social media. And then you have these companies that have an obvious financial interest in driving forward with AI developments. Can we realistically expect there to be some sort of international governing body or regulation put in place?
Russell: When I mentioned a failure of regulation, the failure is to simply not do anything about it. It’s not enough to subscribe to a set of principles — they have no teeth until they’re turned into regulation.
So I think what’s probably going to happen is that all the major countries are going to need regulatory agencies, just like the Federal Aviation Administration for aviation, or the Nuclear Regulatory Commission for nuclear power. Those agencies have devolved powers, so Congress is not debating the details of large language models and exactly what kind of safety criteria should be applied.
I think most countries are going to set up agencies like that for AI, and then there will need to be some kind of coordination mechanism for all of those agencies, because the last thing you want is for all of the developers to move to whichever country has the most lax regulation, right? We have this problem with taxes, where people go to Luxembourg and the Cayman Islands and so on.
I’d say at the moment, the United States is the most lax in terms of regulation. There’s a bit of a patchwork — California has a law saying AI systems can’t impersonate humans for the purpose of convincing them to vote for a particular candidate in an election. Great. Okay. But there’s a lot more that needs to be done, and you can’t do it with a piecemeal, state-by-state approach.
So I think this is what’s going to happen, but all of this takes time. I wasn’t involved in writing the open letter calling for an immediate pause on training AI systems more powerful than GPT-4, but I think that was the concern underlying it — that we need things to move faster, to make sure that the regulatory environment is in place before uncontrollable general purpose AI emerges.
So that’s not inevitable then — if we get the right regulations in place, we can stop AI from becoming more intelligent than humans?
Russell: Well, the goal is not to stop it from becoming more intelligent than humans. The goal is that as it becomes more powerful, we enforce certain design constraints that result in it being controllable and it being safe. Airplanes go faster than people, but they have to be safe in order for you to be carrying passengers in them. And you can make them as fast as you want.
I think there are other reasons to be concerned about making systems more intelligent than humans, mostly to do with our own self-conception. Even if they are safe and beneficial and so on, what does it do to our conception of ourselves and our motivation and the structure of our society when everything that we’re benefiting from is being produced by the machines and no longer by us?
I don’t know if you’ve seen the film “Wall-E” — because we’ve destroyed our own planet, humanity is left on these giant spaceships run by robots. And since the robots have taken over the management of civilization, there’s no incentive for humans to learn how to run our civilization.
And so it shows humans becoming infantilized and enfeebled. It’s showing a tendency in our civilization, I think, which would massively accelerate if we have general purpose AI, which is to become dependent on the technology and then lose the incentive to learn and to be capable of anything. And that, I think, would be another form of catastrophe.
What is your hope for the future?
Russell: So I hope that we get enough regulation in place that the developers of these systems take seriously their responsibility to understand how they work and ensure that they work in safe and predictable ways. And that the further development of those systems goes hand in hand with more understanding and much more rigorous regulation. In the long run, the next problem we’re going to face is that even though we may understand how to build perfectly safe general purpose AI, what’s to stop Dr. Evil building general purpose AI that’s going to destroy the world?
And you might first think, okay, well we’ll have very strict laws about that, but we have very strict laws about malware and cybercrime, and yet malware and cybercrime are hardly extinct. So the only way to do it, actually, is to change our whole digital infrastructure. What the digital infrastructure does now is run anything unless it recognizes it as dangerous. If you have anti-virus software on your laptop and a known virus gets downloaded, the system will detect it and remove it. But what we need, actually, is for systems to work the other way around — we need to ensure that the hardware and the operating system won’t run anything unless it knows that it’s safe.
That’s a big change in the whole global digital ecosystem, but I think it’s the only solution.
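[Editor’s note: A minimal sketch of the inversion Russell describes: flipping the default from “run unless it is known to be dangerous” to “refuse unless it is known to be safe.” The code is purely illustrative; a real version would be enforced by the hardware and the operating system.]

# Flipping the default: today's model blocks only programs on a known-bad list;
# the proposed model runs only programs on a known-safe list. Everything here
# (the lists, the "programs") is invented for illustration.
import hashlib

KNOWN_BAD = set()          # today's anti-virus model: a blocklist of known threats
KNOWN_SAFE = set()         # the proposed model: an allowlist of vetted programs

def fingerprint(program: bytes) -> str:
    return hashlib.sha256(program).hexdigest()

def run_blocklist(program: bytes) -> str:
    if fingerprint(program) in KNOWN_BAD:
        return "blocked"
    return "executed"      # anything unrecognized runs by default

def run_allowlist(program: bytes) -> str:
    if fingerprint(program) in KNOWN_SAFE:
        return "executed"
    return "refused"       # anything unrecognized is refused by default

novel_malware = b"something no one has ever seen before"
print(run_blocklist(novel_malware))   # "executed" -- the gap Russell worries about
print(run_allowlist(novel_malware))   # "refused"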