Zikun Li

zikun-li

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 1 day ago

AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents

Paper • 2602.06855 • Published 6 days ago • 65

upvoted 5 papers 14 days ago

Self-Distillation Enables Continual Learning

Paper • 2601.19897 • Published 16 days ago • 26

Group Distributionally Robust Optimization-Driven Reinforcement Learning for LLM Reasoning

Paper • 2601.19280 • Published 16 days ago • 9

Reinforcement Learning via Self-Distillation

Paper • 2601.20802 • Published 15 days ago • 40

Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation

Paper • 2601.20614 • Published 15 days ago • 118

Advancing Open-source World Models

Paper • 2601.20540 • Published 15 days ago • 126

upvoted 2 papers 20 days ago

Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published 25 days ago • 195

InT: Self-Proposed Interventions Enable Credit Assignment in LLM Reasoning

Paper • 2601.14209 • Published 23 days ago • 6

upvoted 3 papers 23 days ago

Your Group-Relative Advantage Is Biased

Paper • 2601.08521 • Published about 1 month ago • 151

NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems

Paper • 2601.11004 • Published 28 days ago • 30

Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge

Paper • 2601.08808 • Published 30 days ago • 39

upvoted 3 papers 26 days ago

Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning

Paper • 2601.07641 • Published Jan 12 • 46

Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning

Paper • 2601.09667 • Published 29 days ago • 89

Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs

Paper • 2601.08763 • Published about 1 month ago • 147

upvoted 2 papers 28 days ago

ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking

Paper • 2601.06487 • Published Jan 10 • 52

Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning

Paper • 2601.09088 • Published 30 days ago • 62

upvoted 4 papers 30 days ago

OpenTinker: Separating Concerns in Agentic Reinforcement Learning

Paper • 2601.07376 • Published Jan 12 • 6

Dr. Zero: Self-Evolving Search Agents without Training Data

Paper • 2601.07055 • Published Jan 11 • 20

GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts

Paper • 2601.05110 • Published Jan 8 • 29

MegaFlow: Large-Scale Distributed Orchestration System for the Agentic Era

Paper • 2601.07526 • Published Jan 12 • 23