Yaqi Duan

duanyq

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 months ago

Don't Waste Mistakes: Leveraging Negative RL-Groups via Confidence Reweighting

upvoted a paper 12 months ago

PILAF: Optimal Human Preference Sampling for Reward Modeling

authored a paper 12 months ago

PILAF: Optimal Human Preference Sampling for Reward Modeling

Organizations

None yet

Papers 1

arxiv:2502.04270

models 0

None public yet

datasets 0

None public yet