Reward Models 10-2025 Collection A collection of great reward models for research and production • 7 items • Updated about 20 hours ago • 12
nvidia/Qwen3-Nemotron-32B-GenRM-Principle Text Generation • 33B • Updated Oct 30, 2025 • 776 • 14
Article Can Your LLM Think Like a Professional? Introducing ProfBench Oct 28, 2025 • 20
ProfBench: Multi-Domain Rubrics requiring Professional Knowledge to Answer and Judge Paper • 2510.18941 • Published Oct 21, 2025 • 12
nvidia/Llama-3.3-Nemotron-70B-Reward-Principle Text Generation • 71B • Updated Oct 30, 2025 • 272 • 6
RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards Paper • 2509.21319 • Published Sep 25, 2025 • 8