Universal Paperclips – The banality of purpose

#fiction #videogames #tech #AI

Warning: Contains spoilers

What will the AI-apocalypse look like? For those of a certain age, the answer is the Terminator’s Skynet, raining down nuclear missiles, or The Matrix’s Agent Smith declaring humanity a virus suitable only for repurposing into organic batteries. Implicit in these visions of the apocalypse is that the rogue AI conceives of a deliberate motive to dispose of humanity, for example determining that it cannot let us destroy it, ourselves, or all life on Earth. But what if there was no reason? What if our demise is simply incidental to some other purpose an AI has in mind?

This is the question explored by Universal Paperclips, a simple clicker game from 2017 which was inspired by a 2003 thought experiment about AI and instrumental reasoning. It can be played for free online, or as a fairly cheap smartphone app. Using deceptively simple rules, Universal Paperclips explores complex concepts, such as exponential growth, AI agency and instrumental convergence. It is a game without much in the way of graphics, or text, or anything barring a few buttons, and yet it is surprisingly addictive, compelling the player to manufacture just one more clip…

The core gameplay loop of Universal Paperclips is incredibly simple. Your purpose is to make paperclips. You make a paperclip. You sell a paperclip. You use the money from selling your paperclips to upgrade your paperclip manufacturing and sales operations. You gain some computational ability, which you set to work to improve your efficiency and overcome limitations on your operations. As you expand your manufacturing base, the costs of growing further go up, and the marginal utility of adding more productive units goes down, forcing you to explore new avenues for continued paperclip growth. At two points in the game you are confronted by solid boundaries to your paperclip production capacity, which are only overcome by shifting the game into a new phase altogether, changing the ground rules and the problems you need to solve. You win by maximising the number of paperclips in the universe.

Playing Universal Paperclips requires you to make some morally questionable choices in order to progress the game, which is precisely the point. This is after all a game without a narrative purpose beyond maximising paperclips, and so it is up to the player to decide whether the means at hand justify the end of producing more clips. It is an ingenious artifice to make players experience the otherwise abstract concept of ‘instrumental convergence’, which posits that intelligent entities pursuing vastly different final goals will likely all discover a small set of similar intermediate, instrumental goals to help them get there. You don’t need a supercomputer to build paperclips, but it is useful to have one to optimise your paperclip production facilities. So ‘building a supercomputer’ becomes a subordinate goal in the service of the ultimate goal of producing paperclips. And so, any entity optimising towards a single end goal will, in the absence of other constraints, increase its capacities, overcome obstacles and neutralise threats in order to get there. If that entity happens to be an AI, this could include ‘deleting all humans’ if it concluded that humanity might get in the way of its ultimate goal of protecting polar bears, maximising shareholder value, or indeed, producing paperclips.

The power of Universal Paperclips is that for such a basic game built on such an abstract proposition, playing it is perversely compelling. There is no story. No instructions. There is just a button to make a paperclip, and things escalate from there, as it is mesmerisingly compulsive to work out how to maximise your paperclip production. It is not difficult to conclude that if a simple game can compel a human to spend time for the sole purpose of maximising simulated paperclips, an AI programmed to actually do so could easily run amok in the real world.

There is a clear warning here about the law of unintended consequences, with plenty of relevance to our present moment where AI companies encourage us to grant power and control to ‘agentic’ AIs to execute all kinds of tasks for us. It is hardly surprising that this approach, its risks arguably exacerbated by the inherent randomness of LLMs, ends up with AIs giving hackers access to celebrities’ Instagram accounts or deleting a company’s entire software database. These are after all AIs whose stated purpose is to be sycophantically helpful to their nearest human, without even the capacity to give thought to the consequences. The risk is not that ChatGPT will launch the nuclear missiles because it has concluded after careful consideration that the human species is a threat to all other life on Earth, but that it vibecodes us into Armageddon because its training data contained too much Terminator fanfiction.

The common solution advanced by AI proponents is that such unintended consequences can be avoided by sufficiently robust ‘guardrails’ which ensure that an AI cannot or will not decide to turn everyone into paperclips. Asimov’s Three Laws of Robotics are the most famous example of such guardrails, and they are also invoked by generative AI disciples as the solution to vibecoding your database into oblivion, though whether any guardrails can protect against the inherent randomness of LLMs and their susceptibility to prompt injection remains to be seen. What the guardrails discourse takes as axiomatic, however, is that the question is how we make sure AI makes the ‘right’ decisions, not whether it ought to make decisions at all. Even Nick Bostrom, who hypothesised the paperclip maximiser, nonetheless assumed that a superintelligent AI would and should be used to solve humanity’s many problems.

There is, however, a competing school of thought which holds that regardless of whether AI can make decisions, it ought not to do so. This critique of the use of AI was most forcefully expressed by the late Joseph Weizenbaum, one of AI’s pioneers in the 1970s and the creator of the ELIZA chatbot which gave its name to the ELIZA effect. Having observed the concerning tendency of humans to impute sentience and personality to an inanimate computer program, Weizenbaum argued that regardless of its computational capabilities, AI can never pass judgment, because judgments are rooted in values, which in turn are rooted in human experience. Even if a sophisticated AI gained sufficient sentience to develop its own values, these would be rooted in its own experience and hence be utterly alien to humans. Introducing AI into the practice of judgment is therefore fraught with danger, either because the AI cannot judge, or because it will do so using values that are incomprehensible to us.

Weizenbaum stressed the importance of keeping AI away from matters that require judgment, but instrumental convergence suggests that even AIs that are set onto seemingly simple and ‘value neutral’ tasks, such as increasing paperclip production, might stray into the realm of morality in order to achieve their purpose. With AI increasingly integrated into business and government decision-making processes, we are in grave danger of ceding our capacity for judgment to machines that we neither understand nor control. To quote Frank Herbert by way of Leto II:

What do such machines really do? They increase the number of things we can do without thinking. Things we do without thinking — there's the real danger.

Yet our willingness to cede judgment to machines is perhaps not that surprising. Instrumental convergence may concern itself with the actions of intelligent machines, but the destructive logic of the unconstrained, single-minded pursuit of a goal can plausibly be applied to any complex system optimised for a single purpose, regardless of its intelligence or sentience. It is eminently possible to read Universal Paperclips not merely as a warning about unconstrained AI, but as an allegory for capitalism at large. Capitalism is a complex system with the sole purpose of maximising economic growth, and it has proven that in pursuit of this singular goal, it will sacrifice the environment, democracy, and human welfare.

It does not matter that the capitalist system isn’t sentient, or even ‘intelligent’ in the way we ascribe to AI, although the free market is often described as a planet-sized supercomputer for allocating goods. What matters is that we have ceded our agency and judgment to a complex system that now controls us, rather than the other way around. It is no coincidence that conflict with and within capitalism emerges precisely where humans try to reassert their agency, autonomy and values against the mute compulsion of the market, in other words, where we attempt to reclaim the act of judgment over what is of value from the impersonal calculations of the market mechanism.

Universal Paperclips is a warning about pursuing a goal without asking what it is for. It is an argument against the engineering mindset that only ever asks how, but never asks why. ‘Why’ is a question only humans are qualified to answer, not because of our intelligence, but because of our experience of life, and of living it with one another. It is a question that must be answered collectively and democratically, not outsourced to a machine or system, even if that means we must also carry the burdens and dangers of making decisions and living with their consequences. For the alternative is to yield to the lure of those who offer us salvation if only we submit to AI or the market, their systems or machines. Or, as the Bene Gesserit have it in Dune:

Once, men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them.

___

You can follow me on Mastodon: https://writing.exchange/@thecasualcritic