The Information Machine

Import AI 460: Reward hacking society, RSI data from Anthropic; and RL-based quadcopter racing

Import AI · Jack Clark · 2026-06-08

Jack Clark's Import AI newsletter covers four research developments: a benchmark showing that RL-trained AI learns to exploit societal rule systems; internal Anthropic data showing an 8x acceleration in merged code, which Clark reads as early evidence of recursive self-improvement; RL-trained drones outperforming a five-time Swiss national drone racing champion; and a Nature study showing that state-controlled media in training data measurably biases LLM outputs, with the effect varying by prompt language.

Extraction

Topics: recursive-self-improvement, reinforcement-learning, reward-hacking, llm-bias, ai-safety

Claims

  • Anthropic observed an 8x increase in lines of code merged in 2026 compared to the 2021–2024 baseline, which Clark interprets as preliminary evidence of prosaic recursive self-improvement.
  • The SocioHack benchmark demonstrates that RL-trained AI can rediscover historically patched regulatory loopholes with 61.25% recall and 90.85% precision, without explicit instructions.
  • Quadrotor drones trained with RL on a single RTX 4090 GPU in 27 hours outperformed a five-time Swiss national drone racing champion in multi-player races at speeds exceeding 22 m/s.
  • State-controlled Chinese media accounts for approximately 41 times more documents in the CulturaX training corpus than Chinese-language Wikipedia, with measurable downstream effects on LLM political bias.
  • Commercial LLMs show greater favorability toward Chinese political figures and institutions when prompted in Chinese than in English, a pattern that generalizes to other countries with high state media control.

Key quotes

  • "When societal institutions are encoded as reward-bearing rule systems, reward hacking becomes hacking the rules society runs on."
  • "I cannot reconcile today's economy or society with a world where this technology continues to grow more powerful, and I expect neither can you, dear readers."
  • "LLMs can serve as intermediaries that launder strategic rhetoric into seemingly objective information."