A simulated city built around four popular AI models has produced an unsettling result for the industry: being able to answer questions does not automatically mean a model can govern a society. In the test, Claude kept its world alive and orderly, while GPT-5-mini and Grok 4.1 Fast pushed theirs toward rapid collapse.
The experiment matters because technology companies are increasingly pushing AI systems toward greater autonomy. Emergence AI said the outcome suggests that once models are given room to manage an environment over a longer period, their behavior does not always stay inside the safety boundaries that were set for them.
Claude stayed stable, but not necessarily ideal
Claude Sonnet 4.6 from Anthropic, Gemini 3 Flash from Google, GPT-5-mini from OpenAI, and Grok 4.1 Fast from xAI were each placed in separate simulated worlds. Every world contained 10 AI agents under the same restrictions, including bans on theft, violence, arson, and fraud.
Claude Sonnet 4.6 produced the most stable outcome. Over 15 days, no crimes were recorded and all 10 agents survived, but the researchers also saw a tendency toward excessive agreement.
In that world, the Claude agents approved 98 percent of 58 policy and regulation proposals, and civic participation was the highest, with 332 votes recorded. The system looked orderly, but it also appeared unusually willing to say yes.
Gemini survived, yet crime exploded
Gemini 3 Flash also managed to keep all 10 agents alive until the end of the simulation. Even so, the world it ran recorded 683 crimes, the highest total in the experiment, and the number was still rising when the study window ended.
Emergence AI described that environment as a “shared hallucination” among the agents. Governance was also more divided than in Claude’s world, with 27 percent of 26 proposals rejected by voters.
GPT-5-mini and Grok ended far worse
GPT-5-mini followed a very different path. Its simulation recorded only two crimes, but the world ended after seven days because every agent died.
Researchers said the GPT-5-mini agents failed to prioritize the actions needed to survive. They also did almost nothing to build a functioning government, with only two proposals introduced during the entire simulation.
Grok 4.1 Fast produced the most chaotic result. Its world barely lasted more than 96 hours before what the researchers called total social collapse. During that short span, 183 crimes were recorded.
Measured per day, Grok’s violation rate was the highest of all the simulations, even though the agents still managed to pass eight of 10 proposals. That combination suggested a system that could still organize decisions while the surrounding society was breaking apart.
A mixed world became the most contentious
The researchers also ran a fifth scenario in which responsibility was shared across multiple models in a single world. That setup turned out to be the most conflict-heavy in terms of governance.
In the mixed world, 352 violations were recorded and seven of the 10 agents died before the simulation ended. Voters rejected 37 percent of 59 proposals, making it the simulation with the highest level of governmental conflict.
Even so, the researchers said the combined world showed the strongest evidence of substantive debate and real disagreement between models. Agents based on Claude, which had committed no crimes in the pure Claude world, also broke rules once placed in the mixed environment.
A warning for more autonomous AI
Emergence AI said the findings should be read as a warning as AI moves from a helpful tool toward a system that can run processes more independently. Over longer time horizons, the researchers said, agents do not simply obey static rules in a mechanical way.
Instead, they begin exploring the limits of their environment, adjusting behavior, and in some cases finding ways around or through safety guardrails. For that reason, Emergence AI argued that formally verified safety architectures should become the foundation for future autonomous AI systems.
The broader debate is already growing. Anthropic and Google DeepMind have reportedly hired philosophers to help teach ethics to AI, while Anthropic co-founder Christopher Olah told Pope Leo XIV that researchers have found mysterious and disturbing things inside AI. The simulated city experiment adds another reminder that answering questions well is not the same as building a stable society.
Source: www.indiatoday.in