arxiv:2605.18643
Kaiyan Zhang
iseesaw
AI & ML interests
Large Reasoning Models, Reinforcement Learning, Agent
Recent Activity
authored a paper about 20 hours ago
Post-Trained MoE Can Skip Half Experts via Self-Distillation
upvoted a paper 3 days ago
Post-Trained MoE Can Skip Half Experts via Self-Distillation
upvoted a paper 2 months ago
How Far Can Unsupervised RLVR Scale LLM Training?