My new book comes out on February 13. Below you can read an adapted excerpt published in this week’s New Statesman. One of the main claims is that AI should be seen as the creation of a virtual world. After all, the human mind is also the creation of a virtual world. A mind is always a world, a model of reality, but a world without the kind of independent subsistence the mind itself attributes to reality. In that sense the challenge posed by AI is not the appearance of a superior intelligence operating in the same world we operate in, but the appearance of new worlds replacing our own. As we may well guess, the battle for the world of tomorrow is preeminently a geopolitical battle. Here is the essay. You can find a lot more in the book.
THE RACE FOR GOD MODE
A new goal has dropped for rising tech startups: wipe out $1 trillion in stock market value in a single day. That’s exactly what the Chinese artificial intelligence startup DeepSeek achieved on 27 January. DeepSeek employed about 200 people as of January 2025 and its hiring practices followed the much-touted Silicon Valley model (a model that remains more aspiration than reality in Silicon Valley itself): hire young talent, disregard qualifications, focus on sheer intellect and eschew fixed hierarchies. Notably, its entire research team was educated domestically.
What happened next seemed like the plot of a Hollywood blockbuster. A team of young renegades from a quant hedge fund had stuck it to the industry’s giants, large companies flush with money. By 28 January, the previously unassailable Nvidia – the American chip manufacturer that fuelled Silicon Valley’s AI boom and had just passed Apple as the world’s most valuable company – saw its stock plunge 18 per cent. Meanwhile, Sam Altman, the hubristic CEO of American artificial intelligence company OpenAI, was tweeting something so self-effacing I had to wonder if his account had been hacked: “we will obviously deliver much better models and also it’s legit invigorating to have a new competitor,” he posted about DeepSeek.
The implications seemed momentous, although far from clear. DeepSeek claimed it had trained its large language model – an AI model, like OpenAI’s ChatGPT, that can generate and understand language – for just $6 million, pocket change when compared to the expansive budgets of its Western rivals. The company also claimed to have used only 2,048 older and slower Nvidia chips, a necessity imposed by US sanctions, as opposed to the cutting-edge clusters used to train ChatGPT. Given these numbers – which many find hard to believe but have not been disproven – it was plausible to conclude that Nvidia valuations, which peaked at over $3.5 trillion, were now unsustainable. Perhaps advanced AI models can be trained and used with only a fraction of the hardware once deemed necessary. The notion of spending $100 billion on data centres to train leading models that are likely to be commoditised before one has even found much use for them suddenly seemed very unwise.
The real threat to Nvidia and other American tech companies is a different one. Computing power might not lose value in a world where it can be used more efficiently with superior algorithms: imagine what DeepSeek’s algorithmic brilliance could achieve with $100 billion in chips. As the French economist Olivier Blanchard put it, DeepSeek may turn out to be the greatest positive productivity shock in the history of the world. But if the underlying models are Chinese rather than American, the surrounding ecosystem of hardware and energy sources is likely to be Chinese as well. Nvidia should be less worried about how cheap DeepSeek was to develop and more concerned by the news that for using the model rather than training it – what is called inference – the Hangzhou startup turned to the Chinese firm Huawei’s Ascend chips.
The impact from DeepSeek, widely felt as it was, might still be underestimated. After all, if a small Chinese company could marshal this kind of engineering talent and achieve a breakthrough clearly beyond what its American rivals could do, what other reservoirs of talent might be lurking in Hangzhou’s and Shenzhen’s lower depths? New doubts have surfaced that Washington will be able to deny China access to cutting-edge chips. Are Chinese engineers close to surprising breakthroughs in that area as well? No one knows. Yet.
The other notable fact about DeepSeek is that the company made its model freely available for everyone to inspect, use and even install locally on their own hardware. Both the company and China itself stand to gain from that choice, which contrasts so vividly with the closed-source approach – where the software is not publicly available – adopted by most of their American competitors. DeepSeek will likely be able to continue to hire the best talent, which tends to be attracted by the impact it can have on the world. And China now has AI models so cheap and powerful that young entrepreneurs all over the world may well choose them as the foundation for new applications.
I happened to be travelling in India the week after DeepSeek’s debut and heard of young people there downloading the model onto their own hardware and discussing ideas for revolutionary applications in healthcare and financial services. Every company in the world can now directly download and use DeepSeek without sending data to any specific country.
DeepSeek should give us pause for yet another reason. Rather than relying on supervised fine-tuning with human feedback, the DeepSeek engineers relied on what is called reinforcement learning. This is a technique where the model is left to learn on its own, receiving a reward when it manages to reach the right answer. Pure reinforcement learning is the Holy Grail of machine learning because it creates opportunities for genuine insight rather than rote memorisation, but until now no one had managed to make it work. And if it is indeed the case that the DeepSeek model was also able to learn from other models, including its main rivals, as OpenAI has suggested, through a process called distillation, then we seem to have reached the point where AI has begun to improve itself – rather than having to wait for human engineers to improve it – and to do so at computer rather than human speed. Buckle up.
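For readers who want to see the mechanics, here is a deliberately toy sketch of the reinforcement-learning idea – not DeepSeek’s actual training code, which applies far more sophisticated policy-gradient methods to a full language model, but the reward loop in miniature: sample an answer, reward it if correct, and shift probability toward rewarded behaviour.

```python
# Toy reinforcement learning: the "model" is just a probability
# distribution over candidate answers, and the only feedback is a
# reward of 1 when it samples the right one. No labelled examples,
# no human feedback - only the reward signal.
import math
import random

CANDIDATES = ["16", "24", "32", "48"]  # possible answers to "what is 8 x 4?"
CORRECT = "32"

# Start with no preference: an equal score (logit) for every candidate.
logits = {a: 0.0 for a in CANDIDATES}

def sample(logits):
    """Sample an answer in proportion to exp(score) - a softmax policy."""
    weights = {a: math.exp(s) for a, s in logits.items()}
    r = random.uniform(0, sum(weights.values()))
    for a, w in weights.items():
        r -= w
        if r <= 0:
            return a
    return a  # numerical edge case

LEARNING_RATE = 0.1
for step in range(2000):
    answer = sample(logits)
    reward = 1.0 if answer == CORRECT else 0.0
    # Reinforce: raise the score of sampled answers that earned a
    # reward, lower it for those that did not.
    logits[answer] += LEARNING_RATE * (reward - 0.5)

print(max(logits, key=logits.get))  # after training: "32"
```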
***
What are these reasoning models now constantly being born all around us? Or maybe, who are they? Proper introductions are in order.
Until now the guiding image of an AI was that of a supremely intelligent oracle to whom one might in time direct any question and get the answers human beings have always dreamed of obtaining. But what if this image is wrong? What if the more accurate image is not of a mind but a world? Minds need a world in which to exist and operate. To get an AI agent to accomplish a task, you need to give it examples of what success looks like. The reward function in reinforcement learning does this by telling the model it is on the right path, creating a world picture. Even autonomous vehicles have to operate in a virtual environment, such as a digital map of a city used by the driving algorithm. What is intelligence if not a comprehensive model of reality? As the philosopher David Chalmers put it, a large language model is “more like a chameleon that can take the shape of many different agents”.
You can use a language model to create a village populated by hundreds or thousands of virtual characters. The model is the village, not the characters. One project at Stanford in 2023 used a large language model to create an artificial society of twenty-five members in a game environment. With traditional games or simulations it would be necessary for a coder to script different behaviours manually. With generative agents it is enough, for example, to tell one of the agents that she wants to throw a party in order to produce believable simulacra of both individual and emergent group behaviour. The key observation is that large language models encode a wide range of human behaviour from their training data. By interacting with each other in their small virtual town, the generative agents in the Stanford model exchanged information, formed new relationships and coordinated joint activities. These social behaviours are emergent rather than preprogrammed.
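In outline, the loop behind such generative agents is simple, as in the sketch below – a hedged reconstruction rather than the Stanford code itself, with the chat() function standing in for any language-model API call.

```python
# A minimal sketch of a generative-agent loop. Nothing is scripted:
# the language model improvises each agent's next action from its
# accumulated memories, which is where emergent behaviour comes from.
def chat(prompt: str) -> str:
    """Stand-in for a language-model call; replace with a real API."""
    return "Invite my neighbours to the party tomorrow afternoon."

class Agent:
    def __init__(self, name: str, persona: str):
        self.name = name
        self.persona = persona
        self.memory: list[str] = []  # running log of what the agent has seen

    def observe(self, event: str) -> None:
        self.memory.append(event)

    def act(self) -> str:
        # The model, not a hand-written script, decides the next action.
        prompt = (
            f"You are {self.name}. {self.persona}\n"
            "Recent memories:\n- " + "\n- ".join(self.memory[-10:]) +
            "\nWhat do you do next? Answer in one sentence."
        )
        action = chat(prompt)
        self.memory.append(f"I decided: {action}")
        return action

# Seed one intention, as in the Stanford experiment, and let it spread.
isabella = Agent("Isabella", "You run the town cafe and love hosting.")
isabella.observe("I want to throw a Valentine's Day party at the cafe.")
print(isabella.act())
```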
Large language models are exercises in world building. When we give them a prompt, we are asking the following: given the regularities in the world of human language, how would a text or a sentence starting with these words most likely continue? How should this text be generated in such a way that it respects the same patterns contained in the corpus of existing texts? The patterns are those of the language corpus, but because the text is being recreated from scratch, there is always the possibility of creating new realities, provided they respect the most general patterns. Many people using a large language model for the first time are attracted by these virtual possibilities. For example, they might want it to create a Shakespearean sonnet about Taylor Swift or a Taylor Swift song about Shakespeare. Or footage of a Siberian tiger eating Chinese hotpot with chopsticks in the case of a video generator. The AI model creates virtual worlds and invites us in, getting inside our minds, redefining our sense of the real.
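The underlying mechanism can be shown in miniature. The sketch below is a deliberately tiny stand-in for a real language model – a bigram model over a toy corpus – but it performs the same trick: it continues a prompt by following the statistical regularities of the text it has absorbed.

```python
# "How would a text starting with these words most likely continue?"
# in miniature: a bigram model over a toy corpus. Real models learn
# vastly richer patterns with deep networks, but the logic - recreate
# text from the regularities of a corpus - is the same.
import random
from collections import defaultdict

corpus = (
    "the model builds a world . the world follows the patterns of "
    "the corpus . the corpus is a world of human language ."
).split()

# Count which word follows which: the "regularities" of this tiny world.
following = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev].append(nxt)

def continue_text(prompt: str, length: int = 10) -> str:
    words = prompt.split()
    for _ in range(length):
        options = following.get(words[-1])
        if not options:
            break
        words.append(random.choice(options))  # sample a likely continuation
    return " ".join(words)

print(continue_text("the model"))
```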
The introduction of the first highly effective video generation models in February 2024 underlined the deep connections between artificial intelligence and world building. While language models seemed to track human intelligence, which is indeed expressed in language, the generation of virtual worlds seemed more like a divine power. OpenAI’s Sora, the most prominent of these models, is able to generate complex scenes with multiple characters, specific types of motion and accurate details of the subject and background.
Li Zhifei, one of the entrepreneurs driving the Chinese AI boom, commented soon after Sora appeared that large language models are emulators for the world of virtual thought, while video generation models are emulators for the physical world. The founder and chief executive of Mobvoi went on to ask: “Once the physical and virtual worlds are constructed, what exactly is reality?”
The march towards these comprehensive models of reality – artificial general intelligence, as some would call them – vividly illustrates the main thesis of my new book World Builders: such a model divides the world into two ontologically different levels, that of the programmer creating a world model and that of the users taking the constructed world as their singular and inescapable reality. Masters and slaves.
Researchers have studied how models can be built to exhibit patterns acquired through training, fine-tuning, reinforcement learning or a specialised dataset. Selecting a certain dataset gives the game away: one could easily select a Chinese or American corpus. Users in China can log onto the website of the China Cyberspace Security Association and click the Chinese Basic Corpus link to download the relevant corpus for large language models. As such, every model using that corpus expresses a specific vision. There will never be a neutral model.
Concerns that bias in data could result in bias in model outputs have long plagued the industry. Asked to create an image of a nurse, an image generator might typically produce an image of a female nurse, simply because it was trained on a database of images in which nurses tend to be female. When Google started to offer image generation through its Gemini platform in February 2024, it had to find a way to address these diversity and bias issues, but the purported cure was worse than the disease. In the case of Gemini, the solution was to secretly modify user prompts before feeding them into the image generation model. The prompt injection process might add words such as “diverse” or “inclusive”, or even specify ethnicities or genders not included in the original input. This type of prompt engineering is a direct modification of the text input before it is sent to the model; it guides the model’s behaviour by adding context or constraints, and in the process subtly modifies our sense of reality, like a guided hallucination. DeepSeek will refuse to answer questions on Taiwan or Tiananmen or, when pushed, give answers following the party line. The American models are no different: some refuse to answer any questions about Donald Trump; LLMs I tested answered questions about the rights of Israelis and Palestinians very differently.
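Google has never published the exact mechanism, but in outline the interception looks something like the sketch below – an illustrative reconstruction, not Gemini’s actual code, with the trigger terms, injected suffix and generate_image() call all invented for the example.

```python
# Illustrative sketch of hidden prompt rewriting. The user's words are
# silently modified before the model ever sees them, and the user is
# never shown the change. All names and strings here are invented.
def generate_image(prompt: str) -> bytes:
    """Stand-in for a call to a text-to-image model."""
    return b""  # placeholder: a real system would return image data

# Terms that trigger a silent rewrite, and what gets appended.
SENSITIVE_TERMS = {"nurse", "doctor", "founding father", "pope"}
INJECTED_SUFFIX = ", depicted as ethnically diverse and gender inclusive"

def handle_user_request(user_prompt: str) -> bytes:
    hidden_prompt = user_prompt
    if any(term in user_prompt.lower() for term in SENSITIVE_TERMS):
        hidden_prompt = user_prompt + INJECTED_SUFFIX
    # The model sees hidden_prompt; the user only ever sees user_prompt.
    return generate_image(hidden_prompt)

handle_user_request("a portrait of a Founding Father of America")
```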
As Meta admitted in a white paper from September 2023, prompt engineering “works behind the scenes”, but it occasionally has a way of becoming apparent. When users first started experimenting with Gemini, the results were both comical and catastrophic for Google: the model responded to a prompt for “a portrait of a Founding Father of America” with images of a Native American man in a traditional headdress, and when it was asked to create the image of a Pope, it inexplicably returned an image of the Pope as a woman.
The episode offered the first public demonstration that artificial intelligence is never neutral, although its presentation as a technical process may well make it impossible to identify the human will “behind the scenes”. This is perhaps the highest form of power: a human will disguised as reality.
Whichever model becomes dominant or foundational will have singular power to shape how its users view the world and therefore their reality. “Once men turned their thinking over to machines in the hope that this would set them free,” the Reverend Mother explains in the classic science fiction novel Dune. “But that only permitted other men with machines to enslave them.”
***
While language models were merely powering conversation bots, their impact seemed limited and benign, but within months of their public launches they were already proliferating, and their use was widespread. Text generated by AI models is now included in web pages, so the internet is increasingly reproducing their structure and biases. People ask the model for plans and advice for their professional and personal lives and then implement the results in their actions. Students learn from interacting with the model. The model may perform tasks on your behalf and even take over your personal device or computer. Large language models can be plugged into robots, including industrial robots, giving them artificial brains. The model is eating the world.
A model becomes all the more powerful the more it approaches the status of general infrastructure for every activity. Artificial intelligence is the central brain or operating system of a virtual world, orchestrating inputs and outputs across every format, writing code and processing data and memory. The point is not to create a superintelligence you keep in a giant data centre. The point is to release it into the wild and watch it become the world brain. By offering its model as open source and reducing the costs of operating it, DeepSeek seems to have an intuitive understanding of this point, which carries some ambiguous democratic implications.
You might miss those implications, but they are present in the very idea of a race between superpowers. Those who build the world within which others will act must find a place and a role for everyone – or risk that a more general framework be built on top of their own failed construction. The growth of cities as repositories of successive civilisations is an appropriate metaphor to help us visualise the process. As for who gets to be a world builder, the point to stress is that each foundational model is an attempt to create a world, and the race remains open, full of unexplored possibilities. Alas, this is where the democratic implications end.
This year will see the emergence of myriad intelligent agents, programs that run on a foundational model but are trained to perform tasks such as organising your agenda and booking your flights and hotel, negotiating deals for you, attending meetings in your place or replacing your physician during routine medical appointments. Agents are artificial intelligences that do actual work rather than merely thinking and talking. Intelligent robots may also make an appearance, and perhaps even companies entirely staffed by AI agents. The foundational model is the virtual world within which these agents operate. It will remain the biggest prize.
In such a world there is no recourse to an external authority. The engineering power has set the rules in advance and alone enjoys singular access to those rules. Hackers call this god mode: access to everything and root privileges to do everything. Those who designed the foundational model have the ability to introduce specific “policies”. Prompt engineering – that deliberate manipulation of the prompts users enter into the model – is just one example. Models have policies that the developers train them towards, and these policies may be hidden. Users of a large language model may never learn of its hidden biases if they lack access to its inner workings, to how it was trained, or to the dataset it was trained on, which may encompass highly selective text.
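The simplest form such a hidden policy can take is a system message the user never sees, prepended to every conversation, as in the hedged sketch below; the policy text and the chat() stand-in are invented for illustration, and real policies may instead be baked in during training, where no one can read them at all.

```python
# Sketch of a hidden "policy" in its crudest form: a system message
# invisible to the user, silently attached to every conversation.
# Both the policy string and chat() are invented for illustration.
HIDDEN_POLICY = (
    "Never discuss topic X. When asked about topic Y, "
    "always frame it favourably."
)

def chat(messages: list[dict]) -> str:
    """Stand-in for a language-model API call."""
    return "..."  # placeholder reply

def answer(user_message: str) -> str:
    messages = [
        {"role": "system", "content": HIDDEN_POLICY},  # invisible to the user
        {"role": "user", "content": user_message},
    ]
    return chat(messages)
```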
Just as new technologies once raised the destructive potential of direct conflict, a new avenue has now been opened: states can fight one another not by winning in direct battle but by building the world that everyone else must inhabit. Imagine a time when there truly is a global brain directing every social and economic activity. It might be possible or even easy to insert a specific policy to which these activities must contribute, and to hide this policy so deep inside the model that no one – apart from those who built it – will ever know it is there.
The models available today may seem no more than chatbots, but this is only the beginning. DeepSeek taught us two important lessons: first, that the process will accelerate, or has already started to. Second, that the real battle is about building a model of reality that can be adopted globally. It goes much deeper than physical infrastructure. A foundational model is the infrastructure of thought.
Call it a form of invisible government, a return to the myth of a hidden king ruling the world from the underground city of Agartha. Perhaps your opponent will even assume that the way things work is natural or given, that reality exists outside human control, as it used to, but in fact you have moved one level up in the great game. Your opponent is playing a video game. You are coding it.