What if most RL gains come from 1 transformer layer?
Rohan Paul Twitter · Rohan Paul (@rohanpaul_ai) · 2026-07-03
A new arXiv paper titled 'Is One Layer Enough?' finds that training a single middle transformer layer can match or beat full-parameter RL post-training, suggesting RL gains are concentrated in specific layers rather than distributed across the network.
Extraction
Topics: reinforcement-learning, transformer-architecture, llm-post-training, parameter-efficiency, layer-analysis
Claims
- RL post-training gains are concentrated in specific transformer layers rather than distributed evenly across the network.
- Training a single transformer layer while freezing all others can recover most of the performance gain from full RL training.
- The most impactful layers for RL training are typically located near the middle of the network, while early and late layers contribute less.
- Training only the best middle layers can surpass full RL training: 69.1 versus 66.4 math accuracy on Qwen3-8B, a 2.7-point gain.
- The pattern holds across 7 models, 3 RL methods, and tasks spanning math, code, and agent benchmarks.
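The single-layer setup in the claims above amounts to choosing which parameters receive gradients. Below is a minimal sketch of that selection step, not the paper's code. It assumes per-layer parameters are named in the `model.layers.<i>.` style common to Hugging Face checkpoints. The helper names `layer_index` and `trainable_mask`, the toy 4-block parameter list, and the choice of `num_blocks // 2` as the "middle" layer are all hypothetical and for illustration only.

```python
import re

# Assumed naming convention: per-layer parameters contain "layers.<i>." in their name.
LAYER_PATTERN = re.compile(r"(?:^|\.)layers\.(\d+)\.")


def layer_index(param_name):
    """Return the transformer block index encoded in a parameter name, or None."""
    match = LAYER_PATTERN.search(param_name)
    return int(match.group(1)) if match else None


def trainable_mask(param_names, train_layers):
    """Map each parameter name to True (train) or False (freeze)."""
    wanted = set(train_layers)
    return {name: layer_index(name) in wanted for name in param_names}


# Toy parameter list for a hypothetical 4-block model.
names = [
    "model.embed_tokens.weight",
    "model.layers.0.mlp.up_proj.weight",
    "model.layers.1.mlp.up_proj.weight",
    "model.layers.2.self_attn.q_proj.weight",
    "model.layers.2.mlp.up_proj.weight",
    "model.layers.3.mlp.up_proj.weight",
    "lm_head.weight",
]

num_blocks = 4
middle = num_blocks // 2  # illustrative choice of a "middle" layer
mask = trainable_mask(names, [middle])

for name in names:
    print("train " if mask[name] else "freeze", name)
```

With a PyTorch model, the same mask could be applied by looping over `model.named_parameters()` and setting `param.requires_grad = mask[name]` before building the optimizer. Finding the best layer would presumably require repeating the RL run once per candidate index and comparing results.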
Key quotes
What if most RL gains come from 1 transformer layer?
The useful RL changes are not spread evenly through the network.
Training only the best middle layers can beat full RL, such as 69.1 math accuracy versus 66.4 on Qwen3-8B.