LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling Paper • 2606.18023 • Published 3 days ago • 135
MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection Paper • 2605.30288 • Published 21 days ago • 23
MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection Paper • 2605.30288 • Published 21 days ago • 23
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards Paper • 2605.21467 • Published about 1 month ago • 205
ClawGym: A Scalable Framework for Building Effective Claw Agents Paper • 2604.26904 • Published Apr 29 • 53
Close the Loop: Synthesizing Infinite Tool-Use Data via Multi-Agent Role-Playing Paper • 2512.23611 • Published Dec 29, 2025 • 7
Context as a Tool: Context Management for Long-Horizon SWE-Agents Paper • 2512.22087 • Published Dec 26, 2025 • 4
Scaling Laws for Code: Every Programming Language Matters Paper • 2512.13472 • Published Dec 15, 2025 • 17
A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression Paper • 2604.19572 • Published Apr 21 • 23