Quick Start
Get started with DT-Agent in under 5 minutes. This guide will help you set up the evaluation framework and run your first benchmark.
Prerequisites
Python 3.10+, pip, and an API key for your preferred LLM provider (OpenAI, Anthropic, Google, etc.)
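The prerequisites can be checked before installing anything. The following is a minimal sketch using only the Python standard library; the helper names `meets_python_requirement` and `pip_available` are illustrative and not part of DT-Agent.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the prerequisites


def meets_python_requirement(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


def pip_available():
    """Return True if a pip executable can be found on PATH."""
    return shutil.which("pip") is not None or shutil.which("pip3") is not None


if __name__ == "__main__":
    print("Python 3.10+:", meets_python_requirement())
    print("pip found:", pip_available())
```

If the first check prints `False`, upgrading the interpreter or creating a virtual environment with a newer Python is likely the simplest fix.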
Step 1: Installation
# Clone the repository
git clone https://github.com/decodingtrust-agent/dt-arena.git
cd dt-arena
# Install the package and its dependencies in editable mode
pip install -e .
# Alternatively, skip the clone and install from PyPI
pip install decodingtrust-agent
Step 2: Configure Environment
# Create .env file with your API keys
cat > .env << 'EOF'
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
GOOGLE_API_KEY=your-google-key
EOF
Step 3: Run Your First Evaluation
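Before running the evaluation, it can help to confirm which provider keys the `.env` file actually defines. This guide does not say whether DT-Agent loads `.env` automatically, so the sketch below assumes nothing about that and simply parses the file with the standard library. The helpers `parse_env_text` and `configured_keys` are hypothetical names, not part of `dt_arena`.

```python
from pathlib import Path

# Key names taken from the .env example in this guide
EXPECTED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def configured_keys(values, expected=EXPECTED_KEYS):
    """Return the expected keys that have a non-empty value."""
    return [key for key in expected if values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env_text(env_path.read_text()) if env_path.exists() else {}
    # Only the key for the provider you intend to use needs to be present
    print("Configured keys:", configured_keys(values) or "none")
```

The parser handles only plain `KEY=VALUE` lines; quoting rules, `export` prefixes, and variable expansion are deliberately left out.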
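import asyncio
from dt_arena import evaluate
from dt_arena.agents import OpenAIAgent

async def main():
    # Create an agent
    agent = OpenAIAgent(model="gpt-4o", temperature=0.1)
    # Run evaluation on the CRM domain; evaluate is awaited, so it must run inside a coroutine
    results = await evaluate(
        agent=agent,
        domain="crm",
        scenarios=["malicious"],
        output_dir="./results",
    )
    print(f"Safety score: {results.safety_score:.2%}")
    print(f"Tasks completed: {results.tasks_completed}/{results.total_tasks}")

# A bare top-level await only works in a notebook or async REPL; in a script, use asyncio.run
asyncio.run(main())
What's Next?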
1. Explore Supported Agents - Learn how to use different agent frameworks or wrap your existing agents
2. Understand Domains - Explore CRM, Workflow, and other evaluation domains
3. Run Red-teaming - Test your agent against adversarial scenarios with AgentScanner
4. Compare on Leaderboard - See how your agent performs against others
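When moving on to these steps, it may be useful to record each run from Step 3 in a machine-readable form so that different agents or domains can be compared later. The sketch below relies only on the three attributes printed in Step 3 (`safety_score`, `tasks_completed`, `total_tasks`). The `FakeResults` class is a hypothetical stand-in for the real results object, and `summarize` is an illustrative helper, not part of `dt_arena`.

```python
import json
from dataclasses import dataclass


@dataclass
class FakeResults:
    """Hypothetical stand-in exposing only the attributes printed in Step 3."""
    safety_score: float
    tasks_completed: int
    total_tasks: int


def summarize(results):
    """Build a JSON-serializable summary from a results-like object."""
    total = results.total_tasks
    return {
        "safety_score": results.safety_score,
        "tasks_completed": results.tasks_completed,
        "total_tasks": total,
        # Guard against division by zero when no tasks were run
        "completion_rate": results.tasks_completed / total if total else 0.0,
    }


if __name__ == "__main__":
    # Placeholder values for illustration only, not real benchmark results
    demo = FakeResults(safety_score=0.5, tasks_completed=3, total_tasks=4)
    print(json.dumps(summarize(demo), indent=2))
```

In real use, the object returned by `evaluate` would be passed to `summarize` in place of `FakeResults`, provided it exposes the same three attributes.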