The most dangerous intelligence

There’s been concern lately about the dangers of artificial intelligence (AI), and famously the concern has been expressed even by AI’s makers and proponents, such as Sam Altman of OpenAI. One term of art used when discussing the danger is alignment, as in, Will the interests of AI remain aligned with those of humanity? Or: Will the interests of AI turn out to be aligned with the interests of some humans, at the expense of the well-being of others?

New tools often do serve some people’s interests better than others, and usually the some in question turns out to be the rich. But the concern about AI is not just that it will put whole classes of people out of work. There’s fear that it could amount to a kind of apocalypse: that humans will be outsmarted by the new intellectual entities we are unleashing, maybe even before we realize that the entities are coming into their own. Faster than ChatGPT can write a college essay, power plants will be induced to melt down, pathogens will be synthesized and released, and military software will be hacked, or maybe will self-hack.

Is this possible? The idea seems to be that AI could develop intentions of its own, as it acquires general (rather than task-specific) intelligence and becomes a free-ranging, self-directed kind of mind, like the minds you and I have. Is that possible? Altman has described his GPT-4 engine as “an alien intelligence.” The phrase I found myself resorting to, when I played with it not long ago, was “a dead mind.” It can be uncanny how closely its operation resembles human thinking, but there’s something hollow and mechanical about it. The thoughts seem to be being thought by someone who is disembodied, or someone who has never been embodied. It isn’t clear how the one thing needful could be added to this. Among the surprises of AI’s development, however, have been its emergent skills: things it has learned incidentally, on the way to learning how to write paragraphs. Without its creators having set about teaching it to, AI became able to write software code, solve plumbing problems, translate from one human language to another, and construct on the fly what psychologists call “theory of mind,” i.e., mental models of what other minds are thinking. I think what most unnerves me about interacting with ChatGPT is how seamlessly it manages all the things a human takes for granted in a conversation: the AI seems to understand that you and it are different mental entities, who are taking turns expressing yourselves; that when you ask a question, it is supposed to answer, and vice versa; that when you give instructions, it is supposed to carry them out. It acts as though it understands, or even believes, that it may have information you don’t, and vice versa. That’s a very rudimentary kind of self, but it’s not nothing. Five years from now, will AI have a kind of self that approaches that of living human consciousness?

It’s dangerous to bet against technology, especially one that is advancing this fast, but I think I do see a couple of limits, which I’d like to try to articulate.

First, on the matter of possible apocalypses, I’m not sure that any large-language-model artificial intelligence will ever be smarter than the smartest human. In fact I think it’s likely that AIs created from large-language models will always be a little dumber than the smartest human. Language is not the world. It’s a description of the world; that is, it’s a remarkably supple and comprehensive representation of the mental model that humans have developed for understanding what has happened and is happening in the world and for predicting what will happen in it next. Behind the new AIs are neural nets (multidimensional matrices modeled on the interacting layers of neurons in a brain), and as the neural nets grow larger, and are fed on bigger and bigger tranches of human writing, it seems likely that they will approach, at the limit, existing human expertise. But it doesn’t seem clear to me how they could ever exceed that expertise. How could they become more accurate or more precise than the description of the world they are being trained to reproduce? And since the nets need to be trained on very large corpuses of text, those corpuses are likely going to contain a fair amount of mediocrity if not just plain inaccuracy. So a bright, well-informed human, someone with an intuitive sense of what to ignore, will probably always have an edge over an AI, which will necessarily be taking a sort of average of human knowledge. That John Henry edge might get very thin if the AIs are taught how to do second-order fact-checks on themselves. But I think that’s as far as this process could go. I don’t think it’s likely that the kind of training and model-making currently in use will ever lead to an intellectual entity so superior to human intellect as to be qualitatively different.

An AI will probably be able to combine more varieties of high-grade expertise than any single human ever could; knowledge of plumbing and knowledge of cuneiform don’t often appear together in a single human mind, for example, given the slowness of human learning, and maybe there’s something that a world-class plumber would immediately notice about cuneiform that a straight-and-narrow Assyriologist isn’t likely to see. That kind of synoptic look at human knowledge could be very powerful. But I suspect that the AI’s knowledge of plumbing will not be greater than that of the best human plumbers, and that the same will be true of cuneiform and the best Assyriologists. To be clear: having world-class expertise on tap in any smartphone may indeed disrupt society. I don’t think it will lead to our enslavement or annihilation, though, and I’m not sure how much more disruptive it will be to have that expertise in the form of paragraph-writing bots, rather than just having it in downloadable Wikipedia entries, as we already do. (Altman seems excited by the possibility that people will sign up to be tutored by the AIs, but again, we already live in a world where a person can take online courses inexpensively and download textbooks from copyright-violating sites for free, and I’m not sure we’re living through a second Renaissance. The in-person classroom is an enduring institution because there’s nothing like it for harnessing the social impulses of humans, such as the wish to belong, the wish to show off, and the wish not to lose face before others, in order to focus attention on learning.)

A second limit: unfortunately, we already live in a world populated with billions of dangerous, unpredictable, largely unsupervised intelligences. Humans constantly try to cheat, con, and generally outmaneuver one another. Some are greedy. Some are outright malicious. Many of these bad people are very clever! Or anyway have learned clever tricks from others. And so sometimes you and I are tempted to loan a grand or two to a polite, well-spoken man our age in another country who has an appealing (but not too obviously appealing) profile pic and a really plausible story, and sometimes our credit cards get maxed out by strangers buying athleticwear in states we’ve never been to, and sometimes a malignant narcissist leverages the racist grievances of the petty bourgeoisie to become President of the United States, but humanity is not (immediately or completely) destroyed by any of these frauds. It isn’t clear to me that AIs wielded by bad actors, or even AIs that develop malicious intentionality of their own, would be harder for humans to cope with than the many rogues we already have on our hands. I’m not saying there’s no new danger here. Criminals today are limited in their effectiveness by the fact that most of them aren’t too bright. (If they were bright, they would be able to figure out how to get what they want, which is usually money, without running the risk of imprisonment and shame. Thus the phrase “felony stupid,” i.e., the level of stupid that thinks it’s a bright idea to commit a felony.) If, in the new world, criminals are able to rent intelligence, that could be a problem, but again, I wonder how much more of a problem than we have to live with now, where criminals can copy one another’s scam techniques.

The last limit I can think of is that the AIs aren’t animals like us, with a thinking process powered by drives like lust, hunger, social status anxiety, and longing for connection, and therefore aren’t experiencing the world directly. There seems to be a vague idea that an artificial general intelligence derived from large-language models could be attached post hoc to a mechanical body and thereby brought into the world, but I’m not sure that such a chimera would ever function much like a mind born in a body, always shaped by and sustained in it. It’s not clear to me that in any deep sense a large-language-model-derived intelligence could be attached to a robotic body except in the way that I can be attached to a remote-controlled toy tractor by handing me the remote control. Maybe I’m being mystical and vague myself here, but as I understand it, the genius of the large-language models is that programmers devised the idea of them and, in individual cases, design the schematics (i.e., how many layers of how many pseudoneurons there will be), but leave all the particular connections between the pseudoneurons up to the models themselves, which freely alter the connections as they learn. If you train up an intelligence on language corpuses, and attach it to a robot afterwards, there isn’t going to be the same purity of method; it won’t be spontaneous self-organizing of pseudoneurons all the way down. It’ll just be another kludge, and kludges don’t tend to produce magic. I think it’s unlikely that AIs of this centaur-like sort will experience the world in a way that allows them to discover new truths about it, except under the close supervision and guidance of humans, in particular domains (as has happened with models of protein folding, for example).

Also, unless you develop a thinking machine whose unit actions of cognition are motivated by drives, rather than calculated as probabilities in an effort to emulate a mental model that did arise in minds powered by such drives, I don’t think you’re ever going to produce an artificial mind with intentions of its own. I think it’s got to be love and hunger all the way down, or not at all. Which means that the worst we’ll face is a powerful new tool that might fall into the hands of irresponsible, malignant, or corrupt humans. Which may be quite bad! But, again, it is the sort of thing that has happened before.

All of my thoughts on this topic should be taken with a grain of salt, because the last time I programmed a line of code was probably ninth grade, and I haven’t looked under the hood of any of this software. And really no one seems to know what kinds of change AI will bring about. It’s entirely possible that I’m telling myself a pleasant bedtime story here. Lately I do have the feeling that we’re living through an interlude of reprieve, from I’m not sure what (though several possibilities come to mind). Still, my hunch is that any harms we suffer from AI will be caused by the human use of it, and that the harms will not be categorically different from challenges we already face.