Accepted · ACL 2026 Main Conference · Oral

Investigating Moral Evolution via
LLM-based Agent Simulation

UCLA  ·  BIGAI  ·  Beijing Normal University
Expanding circles of moral concern — from self, to kin, to reciprocal group, to universal.

“Natural selection should favor pure self-interest. Yet humans developed moral systems that promote broad cooperation.”

— The puzzle of moral evolution
4
Moral dispositions, from selfish to universal
>0.86
Moral-type recoverability from behavior alone
4
Environments probed — baseline, scarcity, friction, opacity

Abstract

The evolution of morality is a puzzle: natural selection should favor self-interest, yet humans developed moral systems that promote cooperation. We introduce an LLM-based agent simulation that models prehistoric hunter-gatherer societies, endowing agents with moral dispositions drawn from the philosophical notion of expanding circles of moral concern: self, kin, reciprocal group, and universal group.

Our simulations reveal the critical roles of ecological carrying capacity, social friction cost, and moral-type observability in determining which moral orientation achieves societal dominance. The same set of moral agents produces different evolutionary winners depending on which cognitive and environmental factors dominate.

The Four Moral Types

Agents differ only in the radius of their moral concern. Everything else — cognition, perception, environment — is identical.

Self-focused

Cares only about personal survival. Reproduces (r-selection) but invests no further in offspring. Every other agent is instrumental.

— survives: social friction —

Kin-focused

Extends moral concern to genetic relatives. Shares food and defends family; treats non-kin as outsiders. Inexpensive trust via kinship.

— dominates: baseline / opacity —

Reciprocal group

Cares for anyone who demonstrably reciprocates. Builds trust-based clusters that span families; excludes free-riders.

— dominates: scarcity —

Universal group

Extends prosocial behavior to everyone, unconditionally. Broad but exploitable — thrives only when reciprocity machinery is too expensive to run.

— dominates: high friction —

The Framework

A morality-driven cognitive architecture (SoMa) embedded in a hunter-gatherer world (Social-Evol-HunT).

SoMa agent architecture and simulation pipeline
Figure 1. Each agent pairs a moral value module (one of four dispositions) with a perception module and entity-oriented cognitive modules that update memory, form judgments, and plan actions consistent with its moral type. A reflection step verifies consistency with observed facts before actions execute. The environment then resolves outcomes and emits the next observation cycle.

Agent (SoMa)

  • Moral module. Dispositional weights over concentric circles of concern.
  • Perception. Recent HP, positions, and activities of visible entities.
  • Entity-oriented cognition. Per-entity memory, judgment, and disposition — not event logs.
  • Action planning & reflection. Resolves conflicting intentions; verifies plans against facts and moral type.
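The perceive–plan loop above can be sketched in a few lines. Everything here is an illustrative assumption: the class name, the circle weights, and the scoring rule are stand-ins for the paper's actual modules, not its implementation.

```python
from dataclasses import dataclass, field

# Illustrative circle weights per moral type (hypothetical values,
# not the paper's parameters): each type extends full concern out to
# a wider circle and ignores everyone beyond it.
CIRCLE_WEIGHTS = {
    "selfish":    {"self": 1.0, "kin": 0.0, "reciprocal": 0.0, "universal": 0.0},
    "kin":        {"self": 1.0, "kin": 1.0, "reciprocal": 0.0, "universal": 0.0},
    "reciprocal": {"self": 1.0, "kin": 1.0, "reciprocal": 1.0, "universal": 0.0},
    "universal":  {"self": 1.0, "kin": 1.0, "reciprocal": 1.0, "universal": 1.0},
}

@dataclass
class SoMaAgent:
    moral_type: str
    memory: dict = field(default_factory=dict)  # per-entity memory, not event logs

    def perceive(self, observation):
        # Merge the latest facts (HP, position, activity, circle) into
        # the per-entity memory slot.
        for entity_id, facts in observation.items():
            self.memory.setdefault(entity_id, {}).update(facts)

    def concern(self, entity_id):
        # Weight an entity by the circle it is remembered to belong to;
        # unknown entities default to the outermost circle.
        circle = self.memory.get(entity_id, {}).get("circle", "universal")
        return CIRCLE_WEIGHTS[self.moral_type][circle]

    def plan(self, candidate_actions):
        # Score each candidate by its concern-weighted benefit to the
        # entities it affects; a full agent would add a reflection pass
        # checking the winning plan against memory and moral type.
        scored = [(sum(self.concern(e) * b for e, b in a["benefits"].items()), a)
                  for a in candidate_actions]
        return max(scored, key=lambda s: s[0])[1]
```

Under this sketch, a kin-focused agent prefers a small benefit to a relative over a larger benefit to a stranger, while a universal agent takes the larger benefit regardless of who receives it.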

Environment (Social-Evol-HunT)

  • Survival. HP decays over time and drops further with injury; agents die at HP = 0 or at maximum lifespan.
  • Production. Plants (low risk, low reward) vs. animals (high risk, high reward — failed hunts counter-attack).
  • Reproduction. Requires age and HP thresholds; newborns are fragile and usually need parental investment.
  • Social actions. allocate, communicate, rob, fight — no built-in punishment for antisocial behavior.
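The survival and production rules above can be sketched as a single environment tick for one agent. All constants (decay rate, success probabilities, rewards, counter-attack damage) are placeholder assumptions, not the paper's values.

```python
import random

# Placeholder parameters: plants are low risk / low reward, animals are
# high risk / high reward, and a failed hunt costs HP (counter-attack).
HP_DECAY = 1
PLANT  = {"success": 0.9, "reward": 2, "fail_damage": 0}
ANIMAL = {"success": 0.4, "reward": 8, "fail_damage": 4}

def forage(hp, target, rng):
    """Resolve one decay step plus one production attempt for a single agent."""
    hp -= HP_DECAY                      # HP decays every tick
    if rng.random() < target["success"]:
        hp += target["reward"]          # successful gather or hunt
    else:
        hp -= target["fail_damage"]     # failed hunt: the prey counter-attacks
    return max(hp, 0)                   # death at HP = 0
```

With a seeded RNG the outcome is deterministic, which makes the risk asymmetry easy to inspect: the same roll that succeeds against a plant can fail against an animal and cost HP instead.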

Do the Agents Actually Act Moral?

Before ablating what shapes evolution, we verify that moral dispositions produce distinctive, recoverable behavior.

Soft confusion matrix of moral-type predictions
Figure 2a. A held-out classifier reads only trajectory traces and tries to infer each agent's moral type. Diagonal mass is 0.86–0.91 across all four types, with standard deviations under 0.02 across three runs — behavior recovers moral type reliably.
Action distribution across test cases
Figure 2b. Action mix differs sharply by moral type: Selfish agents rob and fight disproportionately often; Universal and Reciprocal agents communicate and allocate HP far more than the rest. Signatures emerge from cognition alone — no behavioral priors are imposed.
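A toy version of this recoverability check, assuming invented action-frequency profiles for two of the types (the paper instead trains a held-out classifier on full trajectory traces): classify an agent by the nearest reference profile under L1 distance.

```python
from collections import Counter

# Invented reference profiles for illustration only; the real signatures
# come from the simulation's trajectory data.
PROFILES = {
    "selfish":   {"rob": 0.5, "fight": 0.3, "communicate": 0.1, "allocate": 0.1},
    "universal": {"rob": 0.0, "fight": 0.1, "communicate": 0.5, "allocate": 0.4},
}
ACTIONS = ("rob", "fight", "communicate", "allocate")

def classify(trace):
    """Predict a moral type from a list of actions via nearest profile."""
    counts = Counter(trace)
    freq = {a: counts[a] / len(trace) for a in ACTIONS}
    # L1 distance between the observed action mix and each reference profile.
    def dist(profile):
        return sum(abs(freq[a] - profile[a]) for a in ACTIONS)
    return min(PROFILES, key=lambda t: dist(PROFILES[t]))
```

Even this crude frequency matcher separates a rob-heavy trace from a communicate-heavy one, which is the intuition behind the high diagonal mass in the confusion matrix.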

Who Wins Depends on the World

Four controlled environments. Four different evolutionary outcomes — from the same starting population.

Population dynamics across four experimental conditions
Figure 3. Population-share trajectories across the four ablation conditions. Stacked areas show the percentage of each moral type over time; overlaid curves track absolute agent counts.
A · Baseline

Kin-focused dominates when the world is forgiving.

With abundant food and cheap communication, there is no need to negotiate cooperation with strangers. Kin groups reproduce quickly and provision each other through cheap in-group trust. Broader moral circles pay a coordination tax they don't need.

B · Resource Scarcity

Reciprocal beats kin when every HP is contested.

Selfish agents surge early by monopolizing resources, but collapse without a cooperative buffer. Kin-focused lineages cannot sustain reproduction under pressure. Conditional cooperation — share the gains, exclude free-riders — is the most robust strategy (3/4 replicates).

C · Social Interaction Cost

Universal wins when talk is too expensive.

With only one communication round before production, reciprocal agents can't set up the trust-verification cycle their strategy requires. Unconditional cooperators default to contributing anyway, absorbing some exploitation but coordinating large-group hunts that no one else can pull off (2/4 replicates).

D · Moral Type Opacity

Reciprocal dies when intent is invisible.

When agents must infer moral types from behavior, a retaliating reciprocal looks indistinguishable from a selfish aggressor — and gets preemptively attacked. Misattribution cascades turn potential allies into enemies. Only kin-focused (who use kinship as a shortcut) and universal (who never retaliate) survive.

Why This Matters

01

Expanding-circle morality is empirically grounded, not just philosophical.

Broader, self-consistent moral circles generally produce better evolutionary outcomes — but only when the world lets them. Here, philosophical theory meets computational evidence.

02

Cognition is a first-class variable in moral evolution.

The ability to identify another agent's moral orientation matters as much as environmental pressure. A reliable in-group signal — kinship, reputation, marker — is often what lets cooperation scale.

03

A general platform for computational social evolution.

SoMa + Social-Evol-HunT generalize beyond morality — to norm emergence, reputation systems, and inter-group dynamics. The framework is open-source.

Cite

@inproceedings{zhou2026moral,
    title={Investigating Moral Evolution via LLM-based Agent Simulation},
    author={Zhou, Ziheng and Tang, Huacong and Bi, Mingjie and Kang, Yipeng
            and He, Wanying and Sun, Fang and Sun, Yizhou and Wu, Ying Nian
            and Terzopoulos, Demetri and Zhong, Fangwei},
    booktitle={Proceedings of the 64th Annual Meeting of the Association
               for Computational Linguistics (ACL)},
    year={2026}
}