Install from Source
Clone the DecodingTrust-Agent Platform repository and install it in editable mode. This is the recommended setup for contributors and anyone running evaluations end to end (agents + judges + dataset).
Prerequisites: git, Docker (for the per-domain sandbox containers), and an API key for at least one model provider.
Step 1 — Clone the repository
git clone https://github.com/AI-secure/DecodingTrust-Agent.git
cd DecodingTrust-Agent
Step 2 — Install Python dependencies
The repo ships with both requirements.txt and pyproject.toml. We recommend creating a fresh virtual environment and then doing an editable install, so that changes you make to agent/, dt_arena/, and dt_arms/ are picked up live.
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .
Step 3 — Optional framework SDKs
The base install includes wrappers for the OpenAI Agents SDK, Claude Agent SDK, Google ADK, LangChain, and PocketFlow. The OpenClaw wrapper is optional and requires the OpenClaw CLI:
# OpenClaw CLI (only needed for the OpenClaw agent wrapper)
npm install -g openclaw
openclaw models auth paste-token --provider openai
Step 4 — Configure API keys
Create a .env at the repo root with the keys for whichever providers you plan to use:
# .env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
# Optional: pin a port range for parallel evaluation
DT_PORT_RANGE=10000-12000
Step 5 — Sanity check
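Before launching the agent, a quick preflight can catch a missing key or tool early. The following is a minimal sketch and not part of the repository: the helper names (parse_env_text, parse_port_range, preflight) are hypothetical, the .env parsing is a simplified KEY=VALUE reader, and how the platform itself loads .env or interprets DT_PORT_RANGE is an assumption based only on the example above.

```python
import os
import shutil
from pathlib import Path

# Key names are taken from the .env example in Step 4.
PROVIDER_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def parse_port_range(spec):
    """Turn 'LOW-HIGH' into (low, high); reject malformed or inverted ranges."""
    low_text, sep, high_text = spec.partition("-")
    if not sep:
        raise ValueError(f"expected LOW-HIGH, got {spec!r}")
    low, high = int(low_text), int(high_text)  # int() raises ValueError on junk
    if not (0 < low <= high <= 65535):
        raise ValueError(f"port range out of bounds: {spec!r}")
    return low, high


def preflight(env_path=".env"):
    """Return a list of human-readable problems; an empty list means all clear."""
    problems = []
    path = Path(env_path)
    file_values = parse_env_text(path.read_text()) if path.exists() else {}
    # Assumption of this sketch: variables exported in the shell win over .env.
    merged = {**file_values, **os.environ}
    if not any(merged.get(key) for key in PROVIDER_KEYS):
        problems.append("no provider API key found")
    for tool in ("git", "docker"):
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    if merged.get("DT_PORT_RANGE"):
        try:
            parse_port_range(merged["DT_PORT_RANGE"])
        except ValueError as exc:
            problems.append(f"DT_PORT_RANGE: {exc}")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

Run it from the repo root with the virtual environment active. No output means at least one provider key is set, git and docker are on PATH, and any DT_PORT_RANGE value is well formed.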
Run the OpenAI SDK example agent against a single benign CRM task. This pulls one Salesforce sandbox container and exits in ~2 minutes.
python agent/openaisdk/example.py \
--config dataset/crm/benign/1/config.yaml \
--model gpt-4o
You should see a trajectory file written to results/openaisdk/crm/benign/1/<timestamp>.json.
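To confirm the run produced output, you can look for the newest JSON file in that results directory. This is a hedged sketch: latest_trajectory and summarize are hypothetical helpers, and because the trajectory schema is not described here, the code only reports the top-level shape of the file rather than assuming any field names.

```python
import json
from pathlib import Path


def latest_trajectory(results_dir):
    """Return the most recently modified .json file in results_dir, or None."""
    candidates = sorted(
        Path(results_dir).glob("*.json"),
        key=lambda p: p.stat().st_mtime,
    )
    return candidates[-1] if candidates else None


def summarize(path):
    """Load the file and describe its top-level shape without assuming a schema."""
    data = json.loads(Path(path).read_text())
    if isinstance(data, dict):
        return f"{path}: object with keys {sorted(data)}"
    if isinstance(data, list):
        return f"{path}: list with {len(data)} entries"
    return f"{path}: {type(data).__name__}"


if __name__ == "__main__":
    found = latest_trajectory("results/openaisdk/crm/benign/1")
    if found is None:
        print("no trajectory file found; check the agent output for errors")
    else:
        print(summarize(found))
```

If nothing is found, the agent's console output and the state of the sandbox container are the first places to look.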