GSM8K-V: Can Vision Language Models Solve Grade School Math Word Problems in Visual Contexts Paper • 2509.25160 • Published Sep 29, 2025 • 30
Hierarchical Budget Policy Optimization for Adaptive Reasoning Paper • 2507.15844 • Published Jul 21, 2025 • 16
LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization Paper • 2507.15758 • Published Jul 21, 2025 • 35
GUI-G^2: Gaussian Reward Modeling for GUI Grounding Paper • 2507.15846 • Published Jul 21, 2025 • 133