HanningZhang/physlean_ds_prover_grpo_1e-6_bs256_step90 • Text Generation • 7B • Updated about 10 hours ago • 8
HanningZhang/physicslean_kimina_train_gen_from_claude_and_grok_deepseek_one_sample_ep1 • Text Generation • 8B • Updated 18 days ago • 13
HanningZhang/physicslean_kimina_train_gen_from_claude_and_grok_deepseek_one_sample_ep2 • Text Generation • 8B • Updated 18 days ago • 19
HanningZhang/physicslean_kimina_train_gen_from_claude_and_grok_deepseek_all_ep2 • Text Generation • 8B • Updated 18 days ago • 21
HanningZhang/physicslean_kimina_train_gen_from_claude_and_grok_deepseek_all_ep1 • Text Generation • 8B • Updated 18 days ago • 15
HanningZhang/physicslean_kimina_train_gen_from_grok_deepseek_one_sample_ep2 • Text Generation • 8B • Updated 18 days ago • 15
HanningZhang/physicslean_kimina_train_gen_from_grok_deepseek_one_sample_ep1 • Text Generation • 8B • Updated 18 days ago • 17
HanningZhang/physicslean_kimina_train_gen_from_grok_deepseek_all_ep2 • Text Generation • 8B • Updated 18 days ago • 14
HanningZhang/physicslean_kimina_train_gen_from_grok_deepseek_all_ep1 • Text Generation • 8B • Updated 18 days ago • 17
HanningZhang/physicslean_kimina_claude_kimina_and_deepseek_all_v2_ep1 • Text Generation • 8B • Updated 18 days ago • 29
HanningZhang/physicslean_kimina_claude_kimina_and_deepseek_all_v2_ep2 • Text Generation • 8B • Updated 18 days ago • 21
HanningZhang/OpenGenAlign-Llama3.1-8B-PPO-Step20-Baseline • Text Generation • 8B • Updated Oct 3, 2025 • 8
HanningZhang/OpenGenAlign-Llama3.2-3B-PPO-Step30-Baseline • Text Generation • 3B • Updated Oct 3, 2025 • 8
HanningZhang/OpenGenAlign-Qwen2.5-7B-PPO-KL_Cliphigher-Step20 • Text Generation • 8B • Updated Oct 3, 2025 • 13
HanningZhang/OpenGenAlign-Qwen2.5-7B-PPO-KL-0002-Step20 • Text Generation • 8B • Updated Oct 3, 2025 • 2