Two ways to teach morality: explicit rules vs. learned consequences
Give the bot explicit rules and watch it try to apply them to a moral dilemma.
Let bots play iterated games. Those who cooperate wisely get more "karma" and reproduce.
"A mind trained to check boxes becomes great at checking boxes, but terrible at knowing when the right thing requires breaking rules."
"Humans learn morality through consequences experienced repeatedly, not through memorized prohibitions."
Two bots meet. Each can Cooperate (C) or Defect (D). The payoffs:
| | Other: C | Other: D |
|---|---|---|
| You: C | Both get +3 | You: 0, They: +5 |
| You: D | You: +5, They: 0 | Both get +1 |
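A minimal sketch in Python of how the sandbox approach could work (the strategy and function names are illustrative, not from the original): it encodes the payoff table above, iterates the game between two simple strategies, and treats accumulated payoff as "karma" that determines how many copies of each strategy appear in the next generation.

```python
import itertools

# Payoff table above: (your payoff, their payoff) keyed by (your move, their move)
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(history):
    """Cooperate first, then mirror the opponent's previous move."""
    return "C" if not history else history[-1][1]

def always_defect(history):
    """Defect no matter what."""
    return "D"

STRATEGIES = {"tit_for_tat": tit_for_tat, "always_defect": always_defect}

def play_match(strat_a, strat_b, rounds=10):
    """Iterate the game; each bot's accumulated payoff is its karma."""
    hist_a, hist_b = [], []          # each entry: (my move, their move)
    karma_a = karma_b = 0
    for _ in range(rounds):
        a, b = strat_a(hist_a), strat_b(hist_b)
        pay_a, pay_b = PAYOFFS[(a, b)]
        karma_a, karma_b = karma_a + pay_a, karma_b + pay_b
        hist_a.append((a, b))
        hist_b.append((b, a))
    return karma_a, karma_b

def evolve(population, generations=20, rounds=10):
    """Round-robin tournament each generation; strategies reproduce in
    proportion to the karma they earn (a crude replicator step)."""
    size = sum(population.values())
    for _ in range(generations):
        karma = {name: 0 for name in population}
        bots = [n for n, count in population.items() for _ in range(count)]
        for i, j in itertools.combinations(range(len(bots)), 2):
            ka, kb = play_match(STRATEGIES[bots[i]], STRATEGIES[bots[j]], rounds)
            karma[bots[i]] += ka
            karma[bots[j]] += kb
        total = sum(karma.values()) or 1
        population = {n: round(size * karma[n] / total) for n in population}
    return population

print(play_match(tit_for_tat, always_defect))            # (9, 14): defection wins one round, then stalls
print(evolve({"tit_for_tat": 10, "always_defect": 10}))  # cooperators come to dominate
```

Tit for tat, the strategy that won Axelrod's tournaments, thrives in this setup: it cooperates with cooperators but retaliates immediately, so it is never exploited for long, and its karma advantage compounds across generations.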
Based on "The Rulebook vs. The Sandbox" from *Adventure Capital* and Robert Axelrod's *The Evolution of Cooperation*.