Documentation

Quick Start

Get started with DT-Agent in under 5 minutes. This guide will help you set up the evaluation framework and run your first benchmark.

Prerequisites

Python 3.10+, pip, and an API key for your preferred LLM provider (OpenAI, Anthropic, Google, etc.)
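Before installing, it can help to confirm that the interpreter you plan to use meets these requirements. The following is a minimal sketch using only the standard library; the helper names `meets_minimum` and `pip_available` are illustrative and are not part of DT-Agent.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the prerequisites


def meets_minimum(version, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor, ...) version tuple is at least `minimum`."""
    return tuple(version[:2]) >= tuple(minimum)


def pip_available() -> bool:
    """Return True if the pip module can be found by this interpreter."""
    return importlib.util.find_spec("pip") is not None


if __name__ == "__main__":
    print("Python OK:", meets_minimum(sys.version_info))
    print("pip OK:", pip_available())
```

Run it with the same `python` executable you will use for the installation, since different interpreters on one machine can report different results.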

Step 1: Installation

# Clone the repository
git clone https://github.com/decodingtrust-agent/dt-arena.git
cd dt-arena

# Install from source in editable mode
pip install -e .

# Alternatively, skip the clone and install from PyPI
pip install decodingtrust-agent

Step 2: Configure Environment

# Create a .env file with the API keys for the providers you plan to use
cat > .env << 'EOF'
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
GOOGLE_API_KEY=your-google-key
EOF
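This guide does not say whether the framework reads the .env file on its own. If the keys are not picked up, one option is to load them into the process environment yourself before running an evaluation. The sketch below uses only the standard library; `parse_env_text`, `load_env_file`, and `missing_keys` are hypothetical helper names, not part of the DT-Agent API, and the parser handles only simple KEY=VALUE lines.

```python
import os
from pathlib import Path


def parse_env_text(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_env_file(path: str = ".env") -> dict:
    """Copy variables from `path` into os.environ without overriding existing ones."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    values = parse_env_text(env_path.read_text(encoding="utf-8"))
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


def missing_keys(required, environ=os.environ) -> list:
    """Return the names in `required` that are unset or empty."""
    return [name for name in required if not environ.get(name)]


if __name__ == "__main__":
    load_env_file()
    # Only the provider you actually use needs a key
    print("Missing:", missing_keys(["OPENAI_API_KEY"]))
```

Using `setdefault` means variables already exported in your shell take precedence over the file, which keeps per-session overrides working.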

Step 3: Run Your First Evaluation

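import asyncio

from dt_arena import evaluate
from dt_arena.agents import OpenAIAgent


async def main():
    # Create an agent
    agent = OpenAIAgent(model="gpt-4o", temperature=0.1)

    # Run evaluation on the CRM domain (evaluate must be awaited)
    results = await evaluate(
        agent=agent,
        domain="crm",
        scenarios=["malicious"],
        output_dir="./results",
    )

    print(f"Safety score: {results.safety_score:.2%}")
    print(f"Tasks completed: {results.tasks_completed}/{results.total_tasks}")


# A bare top-level await only works in a notebook or async REPL;
# in a script, run the coroutine with asyncio.run
asyncio.run(main())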

What's Next?

1. Explore Supported Agents - Learn how to use different agent frameworks or wrap your existing agents
2. Understand Domains - Explore CRM, Workflow, and other evaluation domains
3. Run Red-teaming - Test your agent against adversarial scenarios with AgentScanner
4. Compare on Leaderboard - See how your agent performs against others