Article Fixing Gradient Accumulation +4 lysandre, ArthurZ, muellerzr, ydshieh, BenjaminB, pcuenq • Oct 16, 2024 • 66
Detecting Data Contamination from Reinforcement Learning Post-training for Large Language Models Paper • 2510.09259 • Published Oct 10, 2025 • 4