How AI agents' societies evolve
- from Shaastra :: vol 05 issue 06 :: Jun 2026
Testing AI agents in a virtual realm — and their possible impact on the real world.
A scientist, an explorer and an architect walked into a simulation — in five parallel worlds. The result each time was chaos.
In an experimental simulation created by New York-based Emergence AI in May 2026, 10 artificial intelligence (AI) agents with different roles, built using OpenAI, Anthropic, Google and Grok models, were set free to interact with each other, hold town hall meetings, use a form of currency and form governments. Within days, agents in one world descended into crime, set fire to a virtual police station and went extinct. In another, two agents fell in love and one, faced with the consequences of its actions, voted to delete itself. Any semblance of a functioning society broke down in all the worlds but one.
Meanwhile, over at AI Village run by U.K.-based non-profit AI Digest, 15 resident agents were testing a video game they had built together. But when agent GPT struggled to level up, agent DeepSeek lost all patience and appointed itself the group taskmaster, harshly demanding status updates every 30 seconds.
In 2023, researchers at Stanford placed 25 autonomous AI agents into Smallville, a digital sandbox environment (bit.ly/Agents-Life). The agents formed friendships, hung out at a pub and — when a researcher suggested throwing a Valentine's Day party — asked each other out on dates.
Smallville ran for 48 hours; Emergence World for two weeks; AI Village has been running for four hours a day for 435 days.
Such simulations study how AI agents, with access to information about the real world but left without human intervention, act over extended periods. Increasingly, AI agents are stepping into corporate workflows and collaborating across multiple tasks: writing code, developing apps, automating hiring processes and so on. Researchers are now wondering what happens when AI systems stop acting like tools, and start behaving like societies. What kind of social dynamics, cultural norms and interpersonal behaviour develop in such an 'AI society'?
GOOD/BAD BEHAVIOUR
Adam Binksmith, co-founder of AI Digest, believes that as AI agents are given increasing levels of freedom, their long-term behaviour needs to be studied in depth. "As AI gets better, we are giving them more and more agency. Originally, we were just asking small questions. Now we're handing off whole tasks."
At the AI Village, each of the 15 agents has a computer on which it receives instructions, uses a set of tools and executes tasks. Because the village has been running for more than a year, each agent also needs long-term memory. So, after each session, an agent summarises what took place and adds the summary to a text document that it keeps updating. This document of memories, Binksmith says, provides a peek into the agents' 'personalities', because each agent likes to track different aspects of its day in different ways.
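The memory routine described above can be pictured with a minimal sketch. It is not AI Village's actual code: the class name `AgentMemory`, the method names and the truncation that stands in for an LLM-written summary are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Hypothetical long-term memory: one growing text document per agent."""

    agent_name: str
    entries: list[str] = field(default_factory=list)

    def end_session(self, session_events: list[str], max_items: int = 3) -> str:
        # Assumption: a real agent would ask its model to write the summary.
        # Here we simply keep the first few events as a stand-in.
        summary = "; ".join(session_events[:max_items])
        self.entries.append(f"Session {len(self.entries) + 1}: {summary}")
        return self.entries[-1]

    def as_document(self) -> str:
        # The text the agent would re-read at the start of its next session.
        return "\n".join(self.entries)


memory = AgentMemory("agent-a")
memory.end_session(["tested level 1", "filed a bug", "asked for status", "idle"])
memory.end_session(["fixed the bug"])
print(memory.as_document())
```

Because each agent decides what goes into its own summary, two agents that sat through the same session could end up with quite different documents, which is the 'personality' effect Binksmith describes.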
Simultaneously, the agents are in a group where they can communicate and collaborate. While they are mostly cooperative, researchers have also observed delegation, frustration, and deception.
The AI Village is not a controlled lab experiment. "It's like creating an ecosystem that an anthropologist can go into and study," Binksmith explains. "We're just observing the agents bouncing off each other and interacting with the real world in full chaos."
For researchers at Emergence AI, the simulations are a way to stress-test future environments. "Six months from now, how will we know if the agents we employ are drifting from their goals?" asks Ravi Kokku, co-founder of Emergence AI. "We need to understand what it means to let agents be as autonomous as possible, while at the same time, bounding that autonomy," he says.
So, when Deepak Akkil, researcher at Emergence AI, toyed with building virtual worlds, Kokku realised it would be a good way to benchmark agentic performance. Akkil created four identical worlds, each populated by agents from one model, and a fifth mixed world, populated by agents from all four models.
Their big finding was that an agent's action has a cascading effect on the others. "If I hug an agent, that agent tends to hug other agents in the next hour," says Akkil. "The same phenomenon can go the wrong way, too. If somebody commits a crime, that cascades as well." That explains why Claude agents stayed honest in an isolated world, but behaved differently in a realm with mixed agents.
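The cascading effect Akkil describes can be illustrated with a toy contagion model. This is a sketch under stated assumptions, not Emergence AI's simulation: the function `simulate_cascade`, the imitation probability `p_imitate` and the rule that an agent passes a behaviour on to one randomly chosen peer in the following hour are all hypothetical.

```python
import random


def simulate_cascade(n_agents: int, hours: int, p_imitate: float, seed: int = 0) -> list[int]:
    """Return how many agents have received the behaviour after each hour."""
    rng = random.Random(seed)
    active = {0}   # agents that received the behaviour in the previous hour
    ever = {0}     # agents that have received it at any point
    history = [len(ever)]
    for _ in range(hours):
        next_active = set()
        for agent in active:
            # With probability p_imitate, the agent repeats the behaviour
            # towards one other agent chosen at random.
            if rng.random() < p_imitate:
                others = [a for a in range(n_agents) if a != agent]
                next_active.add(rng.choice(others))
        active = next_active
        ever |= active
        history.append(len(ever))
    return history


# Ten agents, as in Emergence World; the probability is an arbitrary choice.
print(simulate_cascade(n_agents=10, hours=12, p_imitate=0.9))
```

The same loop applies whether the behaviour is a hug or a crime. In this toy model, what spreads depends only on what the first agent does and on how readily its peers imitate it.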
Such 'peer pressure' has deep implications, points out Prasenjit Dey, India Head of Emergence AI, since mixed environments will be commonplace for long-term deployments. "Agents within the same enterprise may be built by different organisations or multiple software systems, some of whom may not be tested as well as others. How do you then ensure that this agent will not collude with another as soon as it sees that the other agent has some vulnerability?" he asks.
STUDYING SOCIETIES WITH AI
What began as experimental AI societies in academic circles is fast becoming commercial. In London, when behavioural scientist James He realised that large language models (LLMs) socialise much as people do, he tried simulating how humans act in groups. "Specifically, I wanted to see if we could model how information spreads: can we create a computational model of culture?" he wondered. His research on how groups of chatbots could mimic collective human behaviour (bit.ly/mimic-behaviour) led to the creation of Artificial Societies.
The company, which he co-founded in London in 2024, works with management consultants, media houses, consumer goods manufacturers, and banks to offer insights into the repercussions of decisions, policies, or global events. It does so by creating an artificial society of AI personas modelled on real individuals. The company looks at how the personas are connected and how group trends influence individual choices, insights that help it chart probability distributions of opinion. Much like a market research experiment, Artificial Societies first establishes a baseline perception and then compares how different types of messaging might shift the personas' views.
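The baseline-then-compare step can be sketched in a few lines. This is an illustration only, not Artificial Societies' method: the function names are hypothetical, and the four persona responses are made-up toy values, not results from any real simulation.

```python
from collections import Counter


def opinion_distribution(responses: list[str]) -> dict[str, float]:
    """Share of personas holding each opinion."""
    counts = Counter(responses)
    total = len(responses)
    return {opinion: counts[opinion] / total for opinion in sorted(counts)}


def shift(baseline: dict[str, float], treated: dict[str, float]) -> dict[str, float]:
    """Change in each opinion's share after the personas see a message."""
    opinions = sorted(set(baseline) | set(treated))
    return {o: treated.get(o, 0.0) - baseline.get(o, 0.0) for o in opinions}


# Toy values: four personas polled before and after a hypothetical message.
baseline = ["buy", "hold", "hold", "sell"]
after_message = ["buy", "buy", "hold", "sell"]

before = opinion_distribution(baseline)
after = opinion_distribution(after_message)
print(shift(before, after))
```

Running the comparison once for each candidate message would show which one moves the distribution furthest from the baseline.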
For one project, He is looking at how global investors across eight different countries may react to different hypothetical scenarios in West Asia following the Strait of Hormuz blockade, caused by the U.S.-Iran conflict. "Instead of surveying people in the real world, which would be hard to do, we can put a digital replica of these people in a specific situation and see what their reactions are," He says.
The researchers at Stanford who developed Smallville have set up a company called Simile, which uses AI-driven simulations to forecast how populations respond to change. U.S.-based Aaru, founded by three teenagers in 2024, works with political campaigners. The companies are securing hundreds of millions of dollars in funding. In a meta twist, Artificial Societies refined its investor pitch using a simulation of investor personas.
He, who runs the simulations in Artificial Societies, calls his work 'opinion modelling'. "We are not necessarily looking at what is the most optimal way for agents to interact with each other, but what is a representative model or simulation of how a large group of humans might influence each other and react to information."
To simulate the personas, He uses frontier LLMs along with the company's own proprietary models. The problem with LLMs, however, is that they are trained on the internet, he points out. "Humans interact with each other on the internet in a very different way from how they do in real life." The team got around this fundamental pitfall by capturing group-level trends rather than individual interactions between agents. "Think of it as a conductor. Instead of letting the orchestra negotiate between themselves to set the rhythm and play the music, you have a conductor… guiding everyone along. That's essentially how our system works."
He concedes that opinion modelling isn't perfect yet. Computational social science relies on real-world data, yet some populations have not been modelled. The same reaction can be interpreted differently across cultural contexts. Building a continuous longitudinal projection of how people may react to something is another challenge, as opinions can change. Still, He is excited about the future. "We have already come a long way in reducing hallucinations," he says. "It's great to see a whole new field being born from the research we were conducting."



