Install from Source

Clone the DecodingTrust-Agent Platform repository and install it in editable mode. This is the recommended setup for contributors and for anyone running evaluations end to end (agents, judges, and dataset).

Prerequisites
Python 3.10+, git, Docker (for the per-domain sandbox containers), and an API key for at least one model provider.
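
The prerequisites above can be checked before cloning. The following is a minimal sketch, not part of the repository: the function name check_prerequisites is hypothetical, and it only verifies the Python version and that git and docker are on PATH (it does not check that the Docker daemon is running or that an API key is set).

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10), tools=("git", "docker")):
    """Return a list of problems found; an empty list means all checks passed.

    Hypothetical helper for illustration, not shipped with the repository.
    """
    problems = []
    # Compare the running interpreter against the minimum required version.
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python {wanted}+ required, found {found}")
    # shutil.which returns None when the executable is not on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("MISSING:", issue)
    if not issues:
        print("All prerequisite checks passed.")
```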

Step 1 — Clone the repository

git clone https://github.com/AI-secure/DecodingTrust-Agent.git
cd DecodingTrust-Agent

Step 2 — Install Python dependencies

The repo ships with both a requirements.txt and a pyproject.toml. We recommend creating a fresh virtual environment and then doing an editable install, so that changes you make to agent/, dt_arena/, and dt_arms/ are picked up without reinstalling.

python3 -m venv .venv
source .venv/bin/activate

pip install --upgrade pip
pip install -r requirements.txt
pip install -e .
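
If you are unsure whether the pip commands ran inside the virtual environment, a quick check from Python can help. This is a small sketch using only the standard library; the function name in_virtualenv is hypothetical. It relies on the fact that, inside a venv, sys.prefix points at the environment while sys.base_prefix points at the base interpreter.

```python
import sys


def in_virtualenv(prefix=sys.prefix, base_prefix=sys.base_prefix):
    """Return True when the interpreter is running inside a venv.

    Hypothetical helper for illustration. The parameters default to the
    current interpreter's values and exist only to make the check testable.
    """
    return prefix != base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print("Virtual environment active:", sys.prefix)
    else:
        print("No virtual environment active; run 'source .venv/bin/activate' first.")
```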

Step 3 — Optional framework SDKs

The base install includes wrappers for the OpenAI Agents SDK, Claude Agent SDK, Google ADK, LangChain, and PocketFlow. The OpenClaw wrapper is optional and requires the OpenClaw CLI:

# OpenClaw CLI (only needed for the OpenClaw agent wrapper)
npm install -g openclaw
openclaw models auth paste-token --provider openai

Step 4 — Configure API keys

Create a .env at the repo root with the keys for whichever providers you plan to use:

# .env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...

# Optional: pin a port range for parallel evaluation
DT_PORT_RANGE=10000-12000
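
To catch typos in the .env file before launching an evaluation, a small validator can be useful. The sketch below is an illustration only: read_env_file and parse_port_range are hypothetical names, the parser handles just simple KEY=VALUE lines, and the start-end interpretation of DT_PORT_RANGE is inferred from the example value above rather than from the platform's own parsing code.

```python
def read_env_file(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments.

    Hypothetical helper; it does not handle quoting or 'export' prefixes.
    """
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def parse_port_range(value):
    """Turn a string such as '10000-12000' into a (start, end) tuple of ints.

    Assumes the format is 'start-end' with both ends inclusive; this is an
    assumption based on the example value, not on the platform's parser.
    """
    start_text, separator, end_text = value.strip().partition("-")
    if not separator:
        raise ValueError(f"expected 'start-end', got {value!r}")
    start, end = int(start_text), int(end_text)
    if not (1 <= start <= end <= 65535):
        raise ValueError(f"invalid port range: {value!r}")
    return start, end
```

For example, calling read_env_file on the contents of .env and passing the DT_PORT_RANGE entry to parse_port_range raises ValueError early if the range is malformed.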

Step 5 — Sanity check

Run the OpenAI SDK example agent against a single benign CRM task. This pulls one Salesforce sandbox container and should exit in about 2 minutes.

python agent/openaisdk/example.py \
  --config dataset/crm/benign/1/config.yaml \
  --model gpt-4o

You should see a trajectory file written to results/openaisdk/crm/benign/1/<timestamp>.json.
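
To confirm that the run produced a trajectory, you can locate the most recent JSON file in that directory and load it. This is a minimal sketch under stated assumptions: the helper names latest_trajectory and load_trajectory are hypothetical, the newest file is chosen by modification time rather than by parsing the timestamp in the file name, and nothing is assumed about the JSON schema beyond it being valid JSON.

```python
import json
from pathlib import Path


def latest_trajectory(result_dir):
    """Return the most recently modified .json file in result_dir, or None.

    Hypothetical helper for illustration, not part of the repository.
    """
    candidates = Path(result_dir).glob("*.json")
    return max(candidates, key=lambda path: path.stat().st_mtime, default=None)


def load_trajectory(path):
    """Load a trajectory file as parsed JSON without assuming its schema."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    newest = latest_trajectory("results/openaisdk/crm/benign/1")
    if newest is None:
        print("No trajectory file found; the sanity check may not have completed.")
    else:
        data = load_trajectory(newest)
        print(f"Loaded {newest} (top-level type: {type(data).__name__})")
```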

Next
Continue to Install Environment to bring up the per-domain Docker sandboxes you'll evaluate against, or jump to Eval with decodingtrust-agent to run a full task list.