Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning Paper • 2508.09726 • Published Aug 13, 2025 • 15
AdoraRL/Qwen2.5-7B-Instruct-1M-KK-5ppl-100step-ADORA Text Generation • 8B • Updated Apr 3, 2025 • 6 • 1