What if most RL gains come from 1 transformer layer?

Rohan Paul Twitter · Rohan Paul (@rohanpaul_ai) · 2026-07-03

A new arXiv paper titled 'Is One Layer Enough?' finds that training a single middle transformer layer can match or beat full-parameter RL post-training, suggesting RL gains are concentrated in specific layers rather than distributed across the network.


Appears in

Wave of Research Advances in RL Post-Training Methods for LLMs

Extraction

Topics: reinforcement-learning, transformer-architecture, llm-post-training, parameter-efficiency, layer-analysis

Claims

RL post-training gains are concentrated in specific transformer layers rather than distributed evenly across the network.
Training a single transformer layer while freezing all others can recover most of the performance gain from full RL training.
The most impactful layers for RL training are typically located near the middle of the network, while early and late layers contribute less.
Training only the best middle layers can surpass full RL training, achieving 69.1 math accuracy versus 66.4 on Qwen3-8B.
This finding holds across 7 models, 3 RL methods, and tasks spanning math, code, and agent benchmarks.
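
To make the "train one layer, freeze the rest" claim concrete, here is a minimal sketch of how the trainable parameters might be selected. It is not the paper's code. It assumes parameter names follow a `...layers.<index>.` pattern, and the helpers `layer_index`, `select_trainable` and `middle_layer` are hypothetical names introduced for illustration. Picking the exact middle index is only a starting guess; the claims above imply the best layer would be found by trying several.

```python
import re

# Assumption: parameter names contain "layers.<index>." for each transformer
# block; adjust the pattern for checkpoints that use a different naming scheme.
LAYER_PATTERN = re.compile(r"(?:^|\.)layers\.(\d+)\.")


def layer_index(param_name):
    """Return the block index encoded in a parameter name, or None if absent."""
    match = LAYER_PATTERN.search(param_name)
    return int(match.group(1)) if match else None


def select_trainable(param_names, target_layer):
    """Map each parameter name to True (train) or False (freeze)."""
    return {name: layer_index(name) == target_layer for name in param_names}


def middle_layer(num_layers):
    """Hypothetical heuristic: start from the middle block index."""
    return num_layers // 2


# Toy parameter names standing in for the keys of a real model.
names = ["model.embed_tokens.weight"]
names += [f"model.layers.{i}.mlp.up_proj.weight" for i in range(4)]
names += ["lm_head.weight"]

mask = select_trainable(names, middle_layer(4))

# In PyTorch, the mask could then be applied with:
#   for name, param in model.named_parameters():
#       param.requires_grad = mask[name]
```

Embeddings and the output head carry no layer index, so this sketch freezes them along with every non-target block.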

Key quotes

What if most RL gains come from 1 transformer layer?

The useful RL changes are not spread evenly through the network.

Training only the best middle layers can beat full RL, such as 69.1 math accuracy versus 66.4 on Qwen3-8B.