ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research
Paper • 2606.07591 • Published • 80
ResearchClawBench: Evaluating AI Agents for Automated Research from Re-Discovery to New-Discovery
Lightweight harness for tool-using LLM agents.
Submit and validate a ResearchClawBench task ZIP