The Information reports that OpenAI has cut inference costs by more than half on some existing models, while logged-out …
Rohan Paul Twitter · Rohan Paul (@rohanpaul_ai) · 2026-06-30
The Information reports that OpenAI has cut inference costs by more than half on some existing models through optimization techniques. OpenAI's adjusted gross margin fell to 33% in 2025, and the company is targeting 52% by year-end 2026; Anthropic's gross margin sits at approximately 44%.
Extraction
Topics: openai, inference-costs, ai-economics, model-efficiency, gross-margins
Claims
- OpenAI has cut inference costs by more than half on some existing models, according to The Information.
- Logged-out ChatGPT traffic ran on only a couple hundred Nvidia GPUs, suggesting significant efficiency in serving unauthenticated users.
- OpenAI's adjusted gross margin fell to 33% in 2025 from 40% in 2024 after inference costs quadrupled, then recovered to 39% in Q1 2026, with a 52% target by year-end 2026.
- Anthropic's gross margin is approximately 44%, indicating that frontier labs remain well below mature software economics.
Key quotes
OpenAI has cut inference costs by more than half on some existing models, while logged-out ChatGPT traffic ran on only a couple hundred Nvidia GPUs.
OpenAI's adjusted gross margin fell to 33% in 2025 from 40% in 2024, after inference costs quadrupled.
Lower cost can raise margins, expand usage limits, or reduce pressure on API pricing.
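To make the link between inference cost and gross margin concrete, here is a minimal arithmetic sketch. It assumes revenue stays constant and that inference makes up a fixed share of cost of revenue. That share is not reported in the source, so `inference_share` is a hypothetical parameter, as is the function name. The reported cut also applies only to some existing models, so a company-wide 50% reduction is an upper-bound scenario, not a claim about OpenAI's actual figures.

```python
def margin_after_inference_cut(margin: float, inference_share: float, cut: float) -> float:
    """Gross margin after cutting inference cost, with revenue held constant.

    margin          -- current gross margin as a fraction (e.g. 0.33)
    inference_share -- HYPOTHETICAL fraction of cost of revenue that is inference
    cut             -- fractional reduction in inference cost (0.5 = halved)
    """
    cost_ratio = 1.0 - margin  # cost of revenue as a fraction of revenue
    # Only the inference portion of the cost shrinks; the rest is unchanged.
    new_cost_ratio = cost_ratio * (1.0 - inference_share * cut)
    return 1.0 - new_cost_ratio


if __name__ == "__main__":
    # 33% is the reported 2025 margin; the inference shares below are assumptions.
    for share in (0.25, 0.5, 1.0):
        new_margin = margin_after_inference_cut(0.33, share, 0.5)
        print(f"inference share {share:.0%}: margin 33% -> {new_margin:.1%}")
```

Under these assumptions, the margin gain scales with how much of the cost base is inference and how widely the savings apply. This is consistent with the quote above: the same savings could instead go toward expanded usage limits or lower API prices rather than margin.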