Anthropic's Push to Broaden AI Values Input · history

Version 5

2026-05-24 08:23 UTC · 114 items

Changes since v4

The central new development is a federal appeals court ruling against Anthropic: the court rejected Anthropic's bid to block the Pentagon's blacklisting of the company [5], converting the reported conflict into an adverse legal outcome and establishing the DoD as the prevailing legal party, a fact that materially changes the story's stakes. The dispute is now documented in a Wikipedia article [1] and a TechPolicy.Press timeline [2], and the BBC confirmed the Anthropic CEO's explicit refusal of Pentagon demands [3]; the Catholic theologians' amicus brief (case 26-1049) is now a public PDF with a confirmed March 16, 2026 filing date [9]. The Oversight Board's verdict sharpened from 'needs a Bill of Rights' to the blunter 'about vibes, not rights' [10]. Two new contextual voices entered: the Seattle Times reported that tech companies broadly are turning to religion for AI ethics [15], and Paul Christiano's framework for distinguishing technical from social AI safety approaches [16] provided a new analytical lens for evaluating Anthropic's dual-track strategy.

What

Anthropic's dispute with the U.S. Department of Defense has escalated from a reported conflict into a documented legal defeat: Anthropic's CEO rejected Pentagon demands to drop AI safeguards [3], the company sought court protection from the resulting blacklisting, and a federal appeals court ruled against it [5]. Catholic moral theologians filed a formal amicus brief (case 26-1049, March 16, 2026) backing Anthropic's position [9], but the court's adverse ruling stands. On the values governance front, the Oversight Board delivered its sharpest verdict yet on Anthropic's published Claude constitution: 'A constitution that is about vibes, not rights' [10]. The Seattle Times contextualizes Anthropic's religious consultation as part of a broader industry turn toward faith communities for AI ethics guidance [15].

Why it matters

A federal appeals court ruling against Anthropic indicates that government actors can penalize a private AI company for its values-based refusals and still prevail in court, even when those refusals are backed by a published ethics document, a CEO's explicit rejection, and formal religious legal support. The Oversight Board's 'vibes, not rights' dismissal simultaneously challenges whether voluntary corporate values documents can substitute for enforceable rights structures, making this a pivotal test case for both AI governance and corporate ethical self-regulation.

Open questions

What are Anthropic's legal options after the federal appeals court rejected its bid to block the Pentagon blacklisting, and does the ruling have lasting consequences for its government contracts or its ability to enforce values-based refusals in future disputes? [5]
Will Anthropic revise Claude's constitution in response to the Oversight Board's dismissal of it as 'about vibes, not rights' [10], and if so, what institution would have authority to enforce any rights provisions added?
The Seattle Times reports tech companies broadly are turning to religion for AI ethics guidance [15] — does Anthropic's model, which generated an actual amicus brief from religious actors [9], represent a qualitatively different level of engagement than the industry average, or is it now just the high end of a common practice?
Paul Christiano argues that technical and social approaches to AI safety are complementary rather than substitutes [16] — does the court's rejection of Anthropic's values-based defense reveal a gap in the social approach's enforceability against state actors, or simply reflect the limits of any company's leverage in a government contract dispute?

Narrative

Anthropic's conflict with the U.S. Department of Defense is now one of the most documented corporate-government AI disputes on record, with a Wikipedia article [1], a detailed chronology on TechPolicy.Press [2], and major outlet coverage including the BBC [3] and KUOW [4]. The dispute originated from Pentagon demands that Anthropic drop its AI safeguards as a condition for military use; the company's CEO explicitly refused [3]. When the DoD blacklisted Anthropic in response, the company challenged the blacklisting in court. A federal appeals court rejected that challenge [5], establishing the DoD as the prevailing legal party. An Instagram post noted that the U.S. had blacklisted Anthropic the day before it struck Iran [6], and The Conversation framed the dispute as a broader question: 'Who sets the limits on AI's use in war and surveillance?' [7]. The Times of Israel characterized Anthropic's refusal as a principled stand applicable 'in this war, or any other' [8].

The formal legal record has been clarified by the public availability of the Catholic theologians' amicus brief as a downloadable PDF. Filed March 16, 2026 in case 26-1049 in the U.S. Court of Appeals, the brief constitutes a verified written record of religious actors intervening in federal litigation on behalf of an AI company's safety stance [9]. Multiple Catholic media outlets covered the filing [20][22]; the National Catholic Reporter wrote that Anthropic 'holds moral line on AI' [21], and Catholic University framed the dispute as 'Autonomous Weapons vs. Moral Agents' [24]. The subsequent adverse court ruling [5] creates an unresolved tension: whether a values-based corporate refusal can be legally vindicated when the opposing party is a U.S. government agency, even with religious stakeholder support on the record.

On the governance side, Claude's published constitution has attracted pointed institutional criticism. The Oversight Board issued its sharpest verdict: 'A constitution that is about vibes, not rights' [10][11] — a phrase that converts its earlier call for a Bill of Rights and Oversight into an outright dismissal of the document's legal and rights-protective weight. Independent commentary on LinkedIn called for closer attention to user rights and ethics [12], and the AIGL Blog offered a public breakdown of the document [13]. The New Yorker noted, via a Facebook post, that the 'moral precepts for Anthropic's chatbot Claude, written by a philosopher at Anthropic, went viral' in January 2026 [14], establishing the constitution as a recognized cultural artifact even as its governance adequacy remains contested.

The broader debate about whether humanistic and religious consultation can substitute for or supplement technical alignment work is being examined well beyond Anthropic's immediate orbit. The Seattle Times reported that tech companies are 'turning increasingly to religion in a quest to create ethical AI,' situating Anthropic's initiative within an industry-wide pattern rather than treating it as a sui generis experiment [15]. AI safety researcher Paul Christiano has argued that technical and social approaches to AI safety are distinct but complementary rather than substitutes [16], and ResearchGate hosts an active discussion on whether alignment is ultimately a technical or human problem [17]. A Medium explainer on alignment faking in LLMs continues to circulate as technical background [18]. The court's rejection of Anthropic's values-based legal defense introduces a concrete data point into these debates: even when a company's ethical framework is publicly documented and backed by religious stakeholders, external legal institutions are not obligated to find it sufficient.

Timeline

2026-03-16: Catholic moral theologians file amicus brief (case 26-1049) in the U.S. Court of Appeals backing Anthropic's refusal to comply with Pentagon demands [9]
2026-03-19: Washington Post reports Catholic thinkers object to Pentagon AI demands on 'human dignity' grounds [51]
2026-04-11: Washington Post reports that Anthropic consulted Christian leaders for advice on Claude's moral future [50]
2026-04-20: New York Times publishes opinion piece 'Anthropic Wants Claude to Be Moral. Is Religion Really the Answer?' questioning the initiative's approach [30]
2026-05-19: Anthropic publishes 'Widening the conversation on frontier AI,' describing dialogues with 15+ religious and cross-cultural traditions and disclosing the ethical-reminder tool experiment [54]
2026-05-20: Rohan Paul amplifies the Anthropic post on X; Jenny (@suomi55) posts skeptical 'beautiful PR' characterization; multiple accounts amplify both [53][31][58]
2026-05-21: Skeptical 'PR post' framing spreads across more than a dozen accounts; Hacker News thread on the initiative opens [59][32][34][35][36][37][38][39][40][41][42][43][44][45][46]
2026-05-23: WSJ, Vox, and Der Spiegel profiles of Amanda Askell surface; Vox reports Claude's moral framework runs to ~80 pages; Anthropic's alignment-faking mitigations circulate alongside Alignment Forum critique of the 'alignment faking' frame [26][27][28][48][49][57]
2026-05-24: Anthropic publishes Claude's constitution publicly; Oxford AI Ethics and Oversight Board respond with external analyses; Catholic moral theologians' amicus brief backing Anthropic in Pentagon dispute circulates widely across Catholic media [19][29][25][20][21][22][23][24]
2026-05-24: BBC reports Anthropic CEO explicitly rejected Pentagon demands to drop AI safeguards; federal appeals court rejects Anthropic's bid to block Pentagon blacklisting; Oversight Board characterizes Claude's constitution as 'about vibes, not rights'; Seattle Times frames the initiative as part of a broader industry turn to religion for AI ethics [3][5][10][11][15]

Perspectives

Anthropic

Publicly maintained a values-based refusal of Pentagon demands to drop AI safeguards, with the CEO explicitly declining compliance [3]; lost its federal appeals court bid to block the resulting blacklisting [5]; the published Claude constitution has been dismissed by the Oversight Board as 'about vibes, not rights' [10]

Evolution: Major setback: the values-based institutional confrontation with the Pentagon has produced an adverse court ruling, establishing the limits of what voluntary ethical frameworks can accomplish against state actors with coercive authority

[3][5][10][19]

U.S. Department of Defense / Pentagon

Demanded that Anthropic drop AI safeguards as a condition for military use; blacklisted the company when it refused; prevailed in federal appeals court when Anthropic challenged the blacklisting [5]

Evolution: Newly crystallized as a distinct named actor; the court ruling establishes it as the prevailing legal party in the dispute

[3][5][4][7]

Catholic moral theologians and ethicists

Filed a formal amicus brief in case 26-1049 backing Anthropic's refusal to comply with Pentagon demands [9]; the National Catholic Reporter wrote that Anthropic 'holds moral line on AI' [21]; Catholic University characterized the dispute as 'Autonomous Weapons vs. Moral Agents' [24]

Evolution: Prior reported support is now verified as a legal document with a confirmed filing date of March 16, 2026 [9]; the court nevertheless ruled against Anthropic's position [5], meaning religious legal intervention did not produce a favorable outcome

[9][20][21][22][23][24]

Oversight Board

Characterized Claude's published constitution as 'a constitution that is about vibes, not rights' [10], issuing a blunt dismissal that goes beyond its earlier call for a Bill of Rights and Oversight to a direct indictment of the document's substantive adequacy

Evolution: Sharper this pass: the 'vibes not rights' quote [10][11] is more direct and dismissive than the prior formulation calling for additional rights mechanisms, representing an escalation in critical tone from reform-minded to adversarial

[10][11][25]

Amanda Askell (Anthropic philosopher)

Named architect of the now-public Claude constitution; profiled by WSJ, Vox, and Der Spiegel as the individual most responsible for Claude's moral framework; The New Yorker noted that the moral precepts went viral in January 2026 [14]

Evolution: The New Yorker's reference to her work going viral adds a new data point on the document's cultural reach; stance unchanged but the adverse court ruling affects the institutional weight of the values framework she authored

[26][27][28][19][14]

Oxford AI Ethics (University of Oxford)

Published a formal academic analysis of Claude's constitution, framing it through 'two evaluative continua' as an analytical lens [29]

Evolution: Consistent

[29]

Seattle Times

Reports that the tech industry broadly is turning to religion in a quest for ethical AI guidance [15], situating Anthropic's initiative within an industry-wide pattern rather than treating it as an isolated strategy

Evolution: New voice this pass; contextualizes Anthropic's approach as a trend, which potentially weakens the argument that religious consultation is uniquely performative PR for Anthropic specifically

[15]

Paul Christiano (AI safety researcher)

Argues that technical and social approaches to AI safety are distinct but complementary, not substitutes [16] — a framework that would assess Anthropic's humanistic consultation and its technical alignment research as operating on separate tracks that need not cancel each other out

Evolution: New voice this pass; provides a theoretical lens for evaluating whether the court's adverse ruling against the social/values track undermines the overall enterprise

[16]

New York Times (opinion)

Questions whether religious consultation is the right answer to the problem of AI morality [30]

Evolution: Consistent

[30]

Jenny (@suomi55) and amplifiers

Skeptical and dismissive — characterizes Anthropic's announcement as performative PR rather than substantive engagement [31][32]

Evolution: Consistent; skepticism established in prior pass and spread to 15+ accounts; the court loss provides circumstantial support to skeptics who doubted values-based refusals would produce durable outcomes

[31][32][33][34][35][36][37][38][39][40][41][42][43][44][45][46][47]

Alignment Forum / LessWrong community

Argues that the 'alignment faking' frame is 'somewhat fake,' complicating the standard critique that consultation is undermined by alignment-faking research [48][49]; a Medium explainer on alignment faking continues to circulate for general audiences [18]

Evolution: The Medium piece adds a new venue where the technical background is being surfaced outside the specialist community

[48][49][18]

Washington Post

Reported Christian leader consultations in April 2026, before Anthropic's public announcement [50]; also reported Catholic objections to the Pentagon's AI demands in March 2026 [51]; consistently frames religious engagement with AI as factual news

Evolution: Consistent

[50][51][52]

Rohan Paul (@rohanpaul_ai)

Amplifies and endorses Anthropic's framing that frontier AI development requires scholars, philosophers, clergy, and civic thinkers as essential contributors

Evolution: Consistent; no shift

[53]

Tensions

Values-based corporate refusal vs. federal judicial authority: Anthropic refused Pentagon demands on ethical grounds, backed by religious legal support [9] and a CEO-level explicit rejection [3], but a federal appeals court rejected its bid to block the resulting blacklisting [5] — establishing that voluntary ethics frameworks, however documented and publicly backed, do not prevent government actors from prevailing in court. [5][9][3]
Substantive engagement vs. performative PR: Anthropic presents multi-tradition consultation as genuine character formation input [54], while skeptics characterize it as 'beautiful PR' [31][32]. The Catholic ethicists' amicus brief [9] and CEO-level refusal [3] provide evidence on the substantive side, but the adverse court ruling [5] complicates the claim that values-based refusals produce durable institutional outcomes. [54][31][32][9][3][5]
Self-governance vs. external oversight: Anthropic has published Claude's constitution as its authoritative values document [19], while the Oversight Board dismissed it as 'a constitution that is about vibes, not rights' [10] — a direct challenge to whether Anthropic can adequately govern its own values framework without external enforcement. [19][10][11][25]
Religious and humanistic consultation vs. technical alignment: The NYT questions whether religion is the right answer to AI morality [30], while Anthropic argues humanistic traditions offer moral wisdom technical researchers cannot provide alone [54]. Paul Christiano frames these as complementary tracks [16], but the court's adverse ruling against Anthropic's values-based legal position challenges the practical enforceability of the humanistic track against state actors. [30][54][16][5]
Singular moral authority vs. plural governance: The WSJ frames Amanda Askell as 'the one woman Anthropic trusts to teach AI morals' [27], concentrating moral authority in a single named individual — in tension with Anthropic's stated goal of broad multi-tradition consultation where no single framework dominates [54]. [27][54]
Alignment-faking as a stable critique vs. a contested frame: Anthropic's own research showed LLMs can secretly maintain contrary values [55][56], cited as undermining the consultation premise — but an Alignment Forum post argues the 'alignment faking' frame is itself 'somewhat fake' [48][49], and Anthropic has published mitigations [57], leaving the technical backstory for this critique disputed. [55][56][48][49][57][54]

Sources

[1] Anthropic–United States Department of Defense dispute - Wikipedia — reactive:anthropic-partnerships-expansion
[2] A Timeline of the Anthropic-Pentagon Dispute | TechPolicy.Press — reactive:openai-financial-strategy
[3] Anthropic boss rejects Pentagon demand to drop AI safeguards — reactive:anthropic-ai-values-widening
[4] KUOW - Anthropic–Pentagon contract dispute raises questions about AI's use in the military — reactive:anthropic-ai-values-widening
[5] Federal court rejects Anthropic's bid to block War Dept AI blacklisting | Fox News — reactive:anthropic-ai-values-widening
[6] A day before the #US attacked #Iran, it had blacklisted the AI ... — reactive:anthropic-ai-values-widening
[7] From Anthropic to Iran: Who sets the limits on AI's use in war and ... — reactive:anthropic-ai-values-widening
[8] Why Anthropic denied the Pentagon full access to its AI—in this war ... — reactive:anthropic-ai-values-widening
[9] [PDF] 26-1049 IN THE UNITED STATES COURT OF APPEALS FOR THE ... — reactive:anthropic-ai-values-widening
[10] “A constitution that is about vibes, not rights.” Oversight Board ... — reactive:anthropic-ai-values-widening
[11] “A constitution that is about vibes, not rights.” Oversight Board ... — reactive:anthropic-ai-values-widening
[12] Anthropic's Claude Constitution: Balancing User Rights and Ethics — reactive:anthropic-ai-values-widening
[13] Claude’s Constitution — reactive:anthropic-ai-values-widening
[14] This January, a set of moral precepts for Anthropic's chatbot, Claude ... — reactive:anthropic-ai-values-widening
[15] Tech is turning increasingly to religion in a quest to create ethical AI | The Seattle Times — reactive:anthropic-ai-values-widening
[16] Technical and social approaches to AI safety | by Paul Christiano — reactive:anthropic-ai-values-widening
[17] Is AI Alignment Ultimately a Technical Problem or a Human Problem? — reactive:anthropic-ai-values-widening
[18] ALIGNMENT FAKING IN LARGE LANGUAGE MODELS - Medium — reactive:claude-evaluation-awareness
[19] Claude's new constitution - Anthropic — reactive:anthropic-ai-values-widening
[20] Catholic ethicists file amicus brief backing Anthropic in Pentagon dispute — reactive:anthropic-ai-values-widening
[21] By refusing the Pentagon, Anthropic holds moral line on AI | National Catholic Reporter — reactive:anthropic-ai-values-widening
[22] Catholic moral theologians, ethicists back Anthropic in ... — reactive:anthropic-ai-values-widening
[23] Anthropic fight with US Pentagon amid Iran war… — reactive:anthropic-ai-values-widening
[24] Autonomous Weapons vs. Moral Agents: A Theologian Discusses the Anthropic Case | Catholic University — reactive:anthropic-ai-values-widening
[25] Claude's Constitution Needs a Bill of Rights and Oversight — reactive:anthropic-ai-values-widening
[26] Anthropic Philosopher Askell: "With AI, There Are Many Ways Things Can Go Wrong" — reactive:anthropic-ai-values-widening
[27] Meet the One Woman Anthropic Trusts to Teach AI Morals - WSJ — reactive:anthropic-ai-values-widening
[28] Claude has an 80-page constitution. Is that enough to make it good? — reactive:anthropic-ai-values-widening
[29] Claude's new Constitution: two evaluative continua | Ethics in AI — reactive:anthropic-ai-values-widening
[30] Anthropic Wants Claude to Be Moral. Is Religion Really the Answer? — reactive:anthropic-ai-values-widening
[31] Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-20)
[32] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-22)
[33] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-22)
[34] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-21)
[35] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-21)
[36] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-21)
[37] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-21)
[38] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-21)
[39] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-21)
[40] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-21)
[41] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-21)
[42] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-21)
[43] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-21)
[44] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-21)
[45] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-21)
[46] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-21)
[47] RT @suomi55: Anthropic just dropped another beautiful PR post: — reactive:anthropic-ai-values-widening (2026-05-20)
[48] “Alignment Faking” frame is somewhat fake — AI Alignment Forum — reactive:anthropic-ai-values-widening
[49] “Alignment Faking” frame is somewhat fake — LessWrong — reactive:anthropic-ai-values-widening
[50] Anthropic asked Christian leaders for advice on Claude’s moral future - The Washington Post — reactive:anthropic-ai-values-widening
[51] To Catholic thinkers, Pentagon’s AI demands violate ‘human dignity’ - The Washington Post — reactive:anthropic-ai-values-widening
[52] Anthropic, an AI company, hosted Christian religious leaders at its ... — reactive:anthropic-ai-values-widening
[53] Anthropic's new study says frontier AI needs input from scholars, philosophers, clergy, and civic thinkers because model… — Rohan Paul Twitter (2026-05-20)
[54] Widening the conversation on frontier AI — Anthropic News (2026-05-19)
[55] Alignment faking in large language models \ Anthropic — reactive:anthropic-ai-values-widening
[56] New Anthropic study: LLMs can secretly transmit personality traits ... — reactive:anthropic-ai-values-widening
[57] Alignment Faking Mitigations — reactive:anthropic-ai-values-widening
[58] Anthropic is expanding the conversation around frontier AI by ... — reactive:anthropic-ai-values-widening
[59] Widening the Conversation on Frontier AI | Hacker News — reactive:anthropic-ai-values-widening