MM-JudgeBias: A Benchmark for Evaluating Compositional Biases in MLLM-as-a-Judge
Paper • 2604.18164 • Published 9 days ago • 4

CXReasonAgent: Evidence-Grounded Diagnostic Reasoning Agent for Chest X-rays
Paper • 2602.23276 • Published Feb 26 • 16