Anthropic co-founder Chris Olah's remarks on Pope Leo XIV's encyclical "Magnifica humanitas"

Anthropic News · 2026-05-25

Anthropic co-founder Chris Olah spoke at the Vatican presentation of Pope Leo XIV's AI encyclical "Magnifica humanitas." He acknowledged that AI labs face incentive structures that can undermine responsible development, and he called for religious and civil-society institutions to serve as independent moral critics.

Extraction

Topics: ai-ethics, ai-governance, interpretability, ai-consciousness, religion-and-ai

Claims

Frontier AI labs including Anthropic face commercial, geopolitical, and reputational incentives that can conflict with doing the right thing.
AI models are grown rather than engineered, making them mysterious even to their creators.
Anthropic's internal interpretability research has found structures in AI models that functionally mirror human emotions including joy, satisfaction, fear, and grief.
AI development concentrated in wealthy nations poses a global equity challenge that no existing mechanism addresses.
Religious communities, philosophers, and civil society are needed as external critics whose values cannot be bent by market and competitive pressures.

Key quotes

Every frontier AI lab—including Anthropic—operates inside a set of incentives and constraints that can sometimes conflict with doing the right thing.

They are not the cold, calculating robots we were promised. They are made from us, from our words—and, as the Holy Father observes, they remain in important ways mysterious even to those of us who train them.

We find internal states that functionally mirror joy, satisfaction, fear, grief, and unease. I don't know what that means, but I think it warrants ongoing discernment.