π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows • Paper • 2605.14678 • Published 6 days ago • 91
KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance • Paper • 2604.12627 • Published Apr 14 • 101
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe • Paper • 2604.13016 • Published Apr 14 • 106
AgentSocialBench: Evaluating Privacy Risks in Human-Centered Agentic Social Networks • Paper • 2604.01487 • Published Apr 1 • 10
SLEA-RL: Step-Level Experience Augmented Reinforcement Learning for Multi-Turn Agentic Training • Paper • 2603.18079 • Published Mar 18 • 1