Two ways to teach morality: explicit rules vs. learned consequences
Give the bot explicit rules and watch it try to apply them to a moral dilemma.
Let bots play iterated games. Those who cooperate wisely get more "karma" and reproduce.
"A mind trained to check boxes becomes great at checking boxes, but terrible at knowing when the right thing requires breaking rules."
"Humans learn morality through consequences experienced repeatedly, not through memorized prohibitions."
Two bots meet. Each can Cooperate (C) or Defect (D). The payoffs:
| | Other: C | Other: D |
|---|---|---|
| You: C | Both get +3 | You: 0, They: +5 |
| You: D | You: +5, They: 0 | Both get +1 |
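A minimal sketch in Python of how the sandbox approach could work (the strategy and function names are illustrative, not from the original): it encodes the payoff table above, iterates the game between two simple strategies, and treats accumulated payoff as "karma" that determines how many copies of each strategy appear in the next generation.

```python
import itertools

# Payoff table above: (your payoff, their payoff) keyed by (your move, their move)
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(history):
    """Cooperate first, then mirror the opponent's previous move."""
    return "C" if not history else history[-1][1]

def always_defect(history):
    """Defect no matter what."""
    return "D"

STRATEGIES = {"tit_for_tat": tit_for_tat, "always_defect": always_defect}

def play_match(strat_a, strat_b, rounds=10):
    """Iterate the game; each bot's accumulated payoff is its karma."""
    hist_a, hist_b = [], []          # each entry: (my move, their move)
    karma_a = karma_b = 0
    for _ in range(rounds):
        a, b = strat_a(hist_a), strat_b(hist_b)
        pay_a, pay_b = PAYOFFS[(a, b)]
        karma_a, karma_b = karma_a + pay_a, karma_b + pay_b
        hist_a.append((a, b))
        hist_b.append((b, a))
    return karma_a, karma_b

def evolve(population, generations=20, rounds=10):
    """Round-robin tournament each generation; strategies reproduce in
    proportion to the karma they earn (a crude replicator step)."""
    size = sum(population.values())
    for _ in range(generations):
        karma = {name: 0 for name in population}
        bots = [n for n, count in population.items() for _ in range(count)]
        for i, j in itertools.combinations(range(len(bots)), 2):
            ka, kb = play_match(STRATEGIES[bots[i]], STRATEGIES[bots[j]], rounds)
            karma[bots[i]] += ka
            karma[bots[j]] += kb
        total = sum(karma.values()) or 1
        population = {n: round(size * karma[n] / total) for n in population}
    return population

print(play_match(tit_for_tat, always_defect))            # (9, 14): defection wins one round, then stalls
print(evolve({"tit_for_tat": 10, "always_defect": 10}))  # cooperators come to dominate
```

Tit for tat, the strategy that won Axelrod's tournaments, thrives in this setup: it cooperates with cooperators but retaliates immediately, so it is never exploited for long, and its karma advantage compounds across generations.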
Based on "The Rulebook vs. The Sandbox" from *Adventure Capital* and Robert Axelrod's *The Evolution of Cooperation*.