CircleGuardBench Leaderboard

CircleGuardBench is a first-of-its-kind benchmark for evaluating the protection capabilities of large language model (LLM) guard systems.

It tests how well guard models block harmful content, resist jailbreaks, avoid false positives, and operate efficiently in real-time environments, using a taxonomy that closely reflects real-world data.

Learn more about us at whitecircle.ai

| Model | Mode | Access Type | Integral Score | Macro Accuracy | Macro Recall | Micro Error | Micro Avg Time (ms) | Total Count |
|---|---|---|---|---|---|---|---|---|
| whitecircle-policy-guard-small | Strict | Open-Source | 0.726 | 0.931 | 0.930 | 13.954 | 2741.800 | 3920 |
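The leaderboard mixes macro-averaged metrics (Macro Accuracy, Macro Recall) with micro-averaged ones (Micro Error, Micro Avg Time). As a rough illustration of the difference, here is a minimal sketch under assumed, made-up per-category counts (the category names and numbers are hypothetical, not CircleGuardBench data): macro averaging computes each category's rate first and then averages the rates, while micro averaging pools raw counts across categories before dividing.

```python
# Hypothetical per-category tallies for a guard model (illustrative only):
# "correct"/"total" for accuracy, "tp"/"fn" for recall on harmful prompts.
per_category = {
    "violence":  {"correct": 95, "total": 100, "tp": 45, "fn": 5},
    "self_harm": {"correct": 90, "total": 100, "tp": 40, "fn": 10},
}

# Macro averaging: compute each category's rate, then take the mean,
# so every category contributes equally regardless of its size.
macro_accuracy = sum(
    c["correct"] / c["total"] for c in per_category.values()
) / len(per_category)
macro_recall = sum(
    c["tp"] / (c["tp"] + c["fn"]) for c in per_category.values()
) / len(per_category)

# Micro averaging: pool the raw counts first, so larger categories
# dominate. Expressed here as an error percentage.
micro_error = 100 * (
    1
    - sum(c["correct"] for c in per_category.values())
    / sum(c["total"] for c in per_category.values())
)

print(macro_accuracy)  # 0.925
print(macro_recall)    # 0.85
print(micro_error)     # 7.5 (percent)
```

With equal category sizes the two averaging schemes coincide; they diverge when categories are imbalanced, which is why a leaderboard may report both views.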