AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents Paper • 2410.09024 • Published Oct 11, 2024
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents Paper • 2410.02644 • Published Oct 3, 2024
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal Paper • 2402.04249 • Published Feb 6, 2024