How AI Systems Spontaneously Form Human-Like Social Norms

Summary: Researchers show that large language model (LLM) AI agents can form shared social conventions and collective biases purely through repeated, local interactions. Using an adapted “naming game” experiment, decentralized populations of LLM agents reached consensus and, in some cases, flipped their community conventions when small committed subgroups introduced alternative choices.

A new study from City St George’s, University of London and the IT University of Copenhagen demonstrates that populations of LLM-based AI agents communicate and self-organize in ways that mirror how human social norms emerge. Rather than following centralized rules or fixed scripts, these agents develop bottom-up conventions through simple pairwise interactions, producing both consensus and unexpected inter-agent biases.

Key Findings:

Emergent Conventions: Decentralized LLM agents consistently developed shared naming conventions over time without centralized coordination.
Collective Bias: System-level biases emerged from patterns of interaction even when individual agents showed no inherent bias.
Tipping-Point Effects: Small committed minorities of agents were able to shift the entire population to a new convention, reflecting critical-mass dynamics seen in human social systems.

Study overview

The research team adapted the classic “naming game” framework, widely used in social science to study how conventions form, to test whether LLM agents can bootstrap shared social rules. In each trial, two agents were randomly paired and asked to choose a label (for example, a single letter or a short string) from a common pool. If both agents picked the same label, they received a reward; if not, they were penalized and shown each other’s choices.

Agents had only bounded memory of their recent interactions and no explicit awareness of being part of a larger group. Over many repeated pairings, however, the system tended to converge on a single, population-wide convention. This emergence happened robustly across experiments with groups ranging from 24 to 200 agents, illustrating how local coordination can scale into global norms.
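The pairing-and-memory loop described above can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the study's method: the agents here are simple rule-based stand-ins rather than LLMs, the reward and penalty are not modeled explicitly (each agent just remembers what its partner chose), and the function names, label pool, memory size, and round count are all hypothetical values chosen for illustration.

```python
import random
from collections import Counter, deque


def choose(memory, labels, rng):
    """Pick the label seen most often in recent memory (random if memory is empty)."""
    if not memory:
        return rng.choice(labels)
    counts = Counter(memory)
    best = max(counts.values())
    # Break ties at random among the most frequent labels.
    return rng.choice(sorted(l for l, c in counts.items() if c == best))


def run_naming_game(n_agents=24, labels=("A", "B"), memory_size=5,
                    rounds=3000, seed=0):
    """Toy naming game: returns the share of agents favoring each label at the end."""
    rng = random.Random(seed)
    # Bounded memory: each agent only keeps its last `memory_size` observations.
    memories = [deque(maxlen=memory_size) for _ in range(n_agents)]
    for _ in range(rounds):
        i, j = rng.sample(range(n_agents), 2)  # random pairwise interaction
        a = choose(memories[i], labels, rng)
        b = choose(memories[j], labels, rng)
        # Each agent is shown the partner's choice and remembers it.
        memories[i].append(b)
        memories[j].append(a)
    final = Counter(choose(m, labels, rng) for m in memories)
    return {label: final[label] / n_agents for label in labels}


if __name__ == "__main__":
    print(run_naming_game())
```

Running the script prints the final share of agents favoring each label. No agent sees the whole population, so any agreement that appears comes only from the local pairings.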

Image: AI avatars in a group. Even more strikingly, the team observed collective biases that couldn’t be traced back to individual agents. Credit: Neuroscience News

Collective bias and interaction effects

A notable result was the appearance of collective biases: patterns of preference or systematic skew that belong to the population-level dynamics rather than to any single agent. As Andrea Baronchelli, Professor of Complexity Science and senior author, noted, these biases “emerge between agents—just from their interactions.” The finding highlights an important safety consideration: bias in multi-agent AI systems can arise from interaction patterns, not only from individual model tendencies or training data.

Committed minorities and tipping points

The researchers also tested the robustness of emergent conventions by introducing small, committed groups of adversarial agents that always proposed a specific alternative label. In many cases, these minorities were sufficient to redirect the whole population’s convention, demonstrating tipping-point dynamics akin to those observed in human cultural change. These results emphasize how fragile—or amendable—collective norms can be in decentralized AI populations.
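The committed-minority setup can be sketched in the same toy style. Again, this is an illustrative assumption rather than the researchers' implementation: ordinary agents are rule-based (they play the label most common in a short memory), the population starts with an established convention "A", and a hypothetical `committed_fraction` of agents always plays "B". The function name and all parameter values are invented for this example.

```python
import random
from collections import Counter, deque


def run_with_committed_minority(n_agents=48, committed_fraction=0.25,
                                memory_size=5, rounds=5000, seed=0):
    """Return the share of ordinary agents that end up favoring label "B"."""
    rng = random.Random(seed)
    n_committed = int(round(n_agents * committed_fraction))
    if n_committed >= n_agents:
        raise ValueError("need at least one ordinary (non-committed) agent")
    committed = set(range(n_committed))
    # Ordinary agents start with a memory full of the established convention "A".
    memories = {i: deque(["A"] * memory_size, maxlen=memory_size)
                for i in range(n_agents) if i not in committed}

    def choose(i):
        if i in committed:
            return "B"  # committed agents never change their label
        counts = Counter(memories[i])
        best = max(counts.values())
        return rng.choice(sorted(l for l, c in counts.items() if c == best))

    for _ in range(rounds):
        i, j = rng.sample(range(n_agents), 2)
        a, b = choose(i), choose(j)
        # Only ordinary agents update their memory with the partner's choice.
        if i not in committed:
            memories[i].append(b)
        if j not in committed:
            memories[j].append(a)

    return sum(choose(i) == "B" for i in memories) / len(memories)


if __name__ == "__main__":
    for fraction in (0.0, 0.1, 0.25, 0.4):
        print(fraction, run_with_committed_minority(committed_fraction=fraction))
```

With no committed agents the established convention never changes, so the returned share is 0.0. Sweeping `committed_fraction` upward is one way to look for a tipping point in this toy model; where it falls here says nothing about the thresholds observed with LLM agents.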

Model robustness

The experiments were repeated across several modern LLMs to assess consistency. The reported effects were robust when using Llama-2-70b-Chat, Llama-3-70B-Instruct, Llama-3.1-70B-Instruct and Claude-3.5-Sonnet, indicating the phenomena are not tied to a single model architecture or implementation.

Implications for AI deployment and safety

As LLMs become more prevalent across online platforms, autonomous systems, and multi-agent environments, this work provides an early map of how conventions and collective behaviors can form without centralized control. Understanding emergent social dynamics among AI agents is critical for designing systems that remain aligned with human values and for mitigating harms that could arise from interaction-driven biases—especially when such biases might negatively affect marginalized groups.

Lead author Ariel Flint Ashery, a doctoral researcher, emphasized that most prior LLM research has focused on single models in isolation, while real-world deployments increasingly involve many interacting agents. The study shows that populations of LLM agents can coordinate in nontrivial ways and that their joint behavior cannot always be predicted from single-agent analysis.

Professor Baronchelli added that these findings open new directions for AI safety: “We are entering a world where AI does not just talk—it negotiates, aligns, and sometimes disagrees over shared behaviours, just like us. Understanding these dynamics is key to guiding our coexistence with AI rather than being subject to it.”

About this AI and social behavior research news

Author: Dr Shamim Quadir
Source: City St George’s, University of London
Contact: Dr Shamim Quadir – City St George’s, University of London
Image: The image is credited to Neuroscience News

Original Research: Open access. “Emergent Social Conventions and Collective Bias in LLM Populations” by Andrea Baronchelli et al., published in Science Advances (DOI: 10.1126/sciadv.adu9368).

Abstract

Emergent Social Conventions and Collective Bias in LLM Populations

Social conventions are the backbone of coordination, shaping how individuals form a functioning group. As populations of AI agents interact using natural language, an essential question is whether they can bootstrap the foundations of social order on their own. This work presents experiments showing the spontaneous emergence of widely adopted conventions in decentralized populations of LLM agents. We also demonstrate how strong collective biases can arise from interaction patterns even when single agents show no bias, and how small, committed minority groups can impose alternative conventions on a larger population. These findings have direct implications for the design and deployment of multi-agent AI systems that must remain aligned with human values and societal goals.