arxiv:2507.08761
JeonghyeKim
beanie00
AI & ML interests
None yet
Recent Activity
updated a model 28 minutes ago: beanie00/Qwen3-8B-Base_sft_v1
published a model 35 minutes ago: beanie00/Qwen3-8B-Base_sft_v1
authored a paper 5 months ago: Penalizing Infeasible Actions and Reward Scaling in Reinforcement Learning with Offline Data
Organizations
None yet