MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers Paper • 2508.20453 • Published Aug 28, 2025
ASTRA: Autonomous Spatial-Temporal Red-teaming for AI Software Assistants Paper • 2508.03936 • Published Aug 5, 2025
ProSec: Fortifying Code LLMs with Proactive Security Alignment Paper • 2411.12882 • Published Nov 19, 2024