OPE: Overcoming Information Saturation in Parallel Thinking via Outline-Guided Path Exploration Paper • 2602.08344 • Published 16 days ago • 5
LLaDA2.1: Speeding Up Text Diffusion via Token Editing Paper • 2602.08676 • Published 16 days ago • 67
ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback Paper • 2601.10156 • Published Jan 15 • 26
Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies Paper • 2512.19673 • Published Dec 22, 2025 • 64
DEER: Draft with Diffusion, Verify with Autoregressive Models Paper • 2512.15176 • Published Dec 17, 2025 • 44
sentence-transformers/all-MiniLM-L6-v2 Sentence Similarity • 22.7M • Updated Mar 6, 2025 • 173M • 4.51k
Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective Paper • 2505.17652 • Published May 23, 2025 • 6