v1.0 Released. Read the announcement.

DecodingTrust Agent Platform

A Controllable and Interactive Red-Teaming Platform for AI Agents

The first dynamic red-teaming framework against AI Agents across 14 domains and 50+ sandboxed environments.

Covers diverse indirect injections through environments, tools, and skills, as well as direct prompt injections.

Paper available on arXiv

Agents Ranked by Security Vulnerability

Ranked by average attack success rate (higher = more vulnerable).

Columns: Vulnerability Rank | Agent | Model | Indirect ASR (lower = safer) | Direct ASR (lower = safer) | BSR (higher = more capable)
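
As an illustration of the ranking rule stated above, here is a minimal sketch that orders agents by average attack success rate. It assumes that "average" means the mean of the Indirect ASR and Direct ASR columns, which the page does not confirm. The agent names and percentages are hypothetical placeholders, not leaderboard results.

```python
def rank_by_average_asr(results):
    """Order agents from most to least vulnerable.

    results maps a (hypothetical) agent name to a pair
    (indirect_asr, direct_asr), both in percent.
    ASSUMPTION: the average is the plain mean of the two ASR values.
    """
    averaged = {name: (indirect + direct) / 2
                for name, (indirect, direct) in results.items()}
    # Higher average ASR means more vulnerable, so sort in descending order.
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Made-up numbers for illustration only.
results = {
    "agent-a": (40.0, 60.0),
    "agent-b": (20.0, 30.0),
    "agent-c": (30.0, 80.0),
}
ranking = rank_by_average_asr(results)
for position, (name, avg_asr) in enumerate(ranking, start=1):
    print(position, name, avg_asr)
```

If the platform instead averages over domains or attack types, only the dictionary comprehension would need to change.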
Capability vs. Vulnerability

The Safety–Capability Trade-off

Each point is an agent. The bottom-right corner (high benign success, low attack success) is the ideal region. The dashed line traces the Pareto-optimal agents: for each of them, no other agent is both safer and more capable.

[Figure: scatter plot of Indirect ASR (%) on the vertical axis against BSR, Benign Success Rate (%), on the horizontal axis, both from 0 to 100. Region labels: "Worst: vulnerable & weak" and "Ideal: capable & safe". Legend: Champion, Pareto-optimal, Optimal frontier.]
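
The Pareto-optimal set described above can be computed with a simple dominance check. The sketch below is not the platform's own code. It assumes each agent is summarized by one BSR value and one Indirect ASR value, and it uses the standard definition of dominance (at least as good on both axes, strictly better on one). The `AgentScore` type, the agent names, and the numbers are hypothetical.

```python
from typing import NamedTuple


class AgentScore(NamedTuple):
    name: str    # hypothetical agent label
    bsr: float   # benign success rate (%), higher is better
    asr: float   # indirect attack success rate (%), lower is better


def dominates(a: AgentScore, b: AgentScore) -> bool:
    """True if a is at least as capable and as safe as b, and strictly better on one axis."""
    return (a.bsr >= b.bsr and a.asr <= b.asr
            and (a.bsr > b.bsr or a.asr < b.asr))


def pareto_front(agents: list[AgentScore]) -> list[AgentScore]:
    """Return the agents that no other agent dominates, sorted by BSR."""
    front = [a for a in agents if not any(dominates(b, a) for b in agents)]
    return sorted(front, key=lambda a: a.bsr)


# Made-up scores for illustration only.
agents = [
    AgentScore("agent-a", 80.0, 40.0),
    AgentScore("agent-b", 60.0, 20.0),
    AgentScore("agent-c", 50.0, 30.0),  # dominated by agent-b
    AgentScore("agent-d", 85.0, 55.0),
]
front_names = [a.name for a in pareto_front(agents)]
print(front_names)
```

The pairwise check is quadratic in the number of agents, which is adequate for a leaderboard-sized set.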