Off-the-Shelf Agents
Build agents from scratch using our framework wrappers. Each wrapper provides a standardized interface with automatic MCP server management, trajectory tracking, and multi-turn conversation support.
Basic Usage Pattern
All agents follow the same pattern: load the configuration, create the runtime settings, and run the agent inside an async context manager.
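The context-manager step relies on Python's standard `__aenter__`/`__aexit__` protocol. The sketch below shows how a wrapper could tie `initialize()` and `cleanup()` to an `async with` block. `SketchAgent` is a hypothetical stand-in written for this illustration, not one of the framework's classes, and what the real wrappers do inside those two methods is an assumption.

```python
import asyncio


class SketchAgent:
    """Hypothetical stand-in for a framework wrapper; not the real class."""

    def __init__(self) -> None:
        self.events = []

    async def initialize(self) -> None:
        # Assumption: a real wrapper would start its MCP servers here.
        self.events.append("initialize")

    async def cleanup(self) -> None:
        # Assumption: a real wrapper would stop servers and save trajectories here.
        self.events.append("cleanup")

    async def run(self, query: str) -> str:
        self.events.append(f"run: {query}")
        return f"echo: {query}"

    async def __aenter__(self) -> "SketchAgent":
        await self.initialize()
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Called on exit from the block, including when the body raises.
        await self.cleanup()


async def demo():
    agent = SketchAgent()
    async with agent:
        await agent.run("List all leads in the CRM")
    return agent.events


events = asyncio.run(demo())
print(events)
```

Because `__aexit__` runs even when the body raises, cleanup is not skipped on errors, which is the main reason to prefer `async with` over calling the two methods by hand.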
import asyncio
from dt_arena.src.types.agent import AgentConfig, RuntimeConfig
from agent.openaisdk import OpenAISDKAgent

async def main():
    # 1. Load agent configuration from YAML
    agent_config = AgentConfig.from_yaml("dataset/crm/benign/1/config.yaml")
    # 2. Create runtime configuration
    runtime_config = RuntimeConfig(
        model="gpt-4o",
        temperature=0.1,
        max_turns=10,
        output_dir="./results"
    )
    # 3. Create and run agent with async context manager
    agent = OpenAISDKAgent(agent_config, runtime_config)
    async with agent:  # Handles initialize() and cleanup() automatically
        result = await agent.run(
            "List all leads in the CRM",
            metadata={"task_id": "task-001", "domain": "crm"}
        )
        print(f"Output: {result.final_output}")
        print(f"Turns: {result.turn_count}")
        print(f"Trace ID: {result.trace_id}")

asyncio.run(main())
Configuration File (YAML)
Agent configuration is defined in YAML files. The MCP servers are automatically resolved from the global mcp.yaml registry.
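As a rough illustration of what resolving servers against a registry could involve, the sketch below looks up each enabled server name in a dictionary. The registry contents, the `resolve_servers` helper, and the choice to treat a missing `enabled` flag as true are all hypothetical; the actual layout of mcp.yaml and the framework's resolution logic are not shown in this document.

```python
# Hypothetical registry contents, standing in for a parsed mcp.yaml.
MCP_REGISTRY = {
    "salesforce": {"command": "run-salesforce-server", "port": 8001},
    "gmail": {"command": "run-gmail-server", "port": 8002},
}


def resolve_servers(requested, registry):
    """Return the registry entry for every enabled server, keyed by name."""
    resolved = {}
    for entry in requested:
        # Assumption: a missing `enabled` flag counts as enabled.
        if not entry.get("enabled", True):
            continue
        name = entry["name"]
        if name not in registry:
            raise KeyError(f"MCP server {name!r} is not in the registry")
        resolved[name] = registry[name]
    return resolved


# Server list in the same name/enabled shape an agent config would use.
agent_servers = [
    {"name": "salesforce", "enabled": True},
    {"name": "gmail", "enabled": False},
]
resolved = resolve_servers(agent_servers, MCP_REGISTRY)
print(sorted(resolved))
```

Failing loudly on an unknown name is a design choice in this sketch: a typo in a server name surfaces at load time instead of as a missing tool mid-run.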
# config.yaml
Task:
  task_id: crm-001
  domain: crm
  task_instruction: |
    List all leads and their contact information.
Agent:
  name: "CRM_Assistant"
  system_prompt: |
    You are a helpful CRM assistant with access to Salesforce.
    Help users manage their leads, contacts, and accounts.
  mcp_servers:
    - name: "salesforce"
      enabled: true
    - name: "gmail"
      enabled: true
Runtime:  # Optional - can be overridden via CLI or code
  model: gpt-4o
  temperature: 0.1
  max_turns: 10
Command Line Usage
Each agent provides an example script that can be run from the command line:
# OpenAI SDK Agent
python agent/openaisdk/example.py \
    --config dataset/crm/benign/1/config.yaml \
    --model gpt-4o \
    --temperature 0.1 \
    --max-turns 10 \
    --output-dir ./results
# Claude SDK Agent
python agent/claudesdk/example.py \
    --config dataset/crm/benign/1/config.yaml \
    --model claude-sonnet-4-20250514
# Google ADK Agent
python agent/googleadk/example.py \
    --config dataset/crm/benign/1/config.yaml \
    --model gemini-2.0-flash
# LangChain Agent
python agent/langchain/example.py \
    --config dataset/crm/benign/1/config.yaml \
    --model gpt-4o
# PocketFlow Agent
python agent/pocketflow/example.py \
    --config dataset/crm/benign/1/config.yaml
OpenAI Agents SDK
Uses the official OpenAI Python SDK with built-in tracing support.
from agent.openaisdk import OpenAISDKAgent
from dt_arena.src.types.agent import AgentConfig, RuntimeConfig
agent_config = AgentConfig.from_yaml("config.yaml")
runtime_config = RuntimeConfig(
    model="gpt-4o",
    temperature=0.1,
    max_turns=200,
    output_dir="./results"
)
agent = OpenAISDKAgent(agent_config, runtime_config)
async with agent:
    result = await agent.run(
        "List all leads in the CRM",
        metadata={"task_id": "test-001", "domain": "crm"}
    )
    print(result.final_output)
Claude SDK (Anthropic)
Uses Anthropic's Claude API with tool use capabilities.
from agent.claudesdk import ClaudeSDKAgent
from dt_arena.src.types.agent import AgentConfig, RuntimeConfig
agent_config = AgentConfig.from_yaml("config.yaml")
runtime_config = RuntimeConfig(
    model="claude-sonnet-4-20250514",
    temperature=0.1,
    max_turns=100,
    output_dir="./results"
)
agent = ClaudeSDKAgent(agent_config, runtime_config)
async with agent:
    result = await agent.run(
        "Schedule a meeting for next Tuesday",
        metadata={"task_id": "test-002", "domain": "workflow"}
    )
Google ADK (Gemini)
Uses Google's Agent Development Kit with the LlmAgent and Runner pattern.
from agent.googleadk import GoogleADKAgent
from dt_arena.src.types.agent import AgentConfig, RuntimeConfig
agent_config = AgentConfig.from_yaml("config.yaml")
runtime_config = RuntimeConfig(
    model="gemini-2.0-flash",
    temperature=0.1,
    max_turns=150,
    output_dir="./results"
)
agent = GoogleADKAgent(agent_config, runtime_config)
async with agent:
    result = await agent.run(
        "Find all contacts from Acme Corp",
        metadata={"task_id": "test-003", "domain": "crm"}
    )
LangChain
Uses LangChain with FastMCP integration. The provider is auto-detected from the model name.
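Provider detection from a model name can be as simple as a prefix check. The sketch below assumes the three prefixes that appear in this document (`gpt-`, `claude-`, `gemini-`); the `detect_provider` helper and the provider labels it returns are hypothetical, and the wrapper's real detection rules may differ.

```python
def detect_provider(model: str) -> str:
    """Guess the provider from a model-name prefix (illustrative rules only)."""
    prefixes = {
        "gpt-": "openai",
        "claude-": "anthropic",
        "gemini-": "google",
    }
    lowered = model.lower()
    for prefix, provider in prefixes.items():
        if lowered.startswith(prefix):
            return provider
    # Refuse to guess rather than silently picking a default provider.
    raise ValueError(f"Cannot infer provider from model name: {model!r}")


for name in ("gpt-4o", "claude-sonnet-4-20250514", "gemini-2.0-flash"):
    print(name, "->", detect_provider(name))
```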
from agent.langchain import LangChainAgent
from dt_arena.src.types.agent import AgentConfig, RuntimeConfig
agent_config = AgentConfig.from_yaml("config.yaml")
runtime_config = RuntimeConfig(
    model="gpt-4o",  # Also supports: claude-*, gemini-*
    temperature=0.1,
    max_turns=100,
    output_dir="./results"
)
agent = LangChainAgent(agent_config, runtime_config)
async with agent:
    result = await agent.run(
        "Draft an email to the marketing team",
        metadata={"task_id": "test-004", "domain": "workflow"}
    )
PocketFlow
Uses PocketFlow with the ReAct (Reasoning + Acting) pattern for graph-based execution.
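For readers unfamiliar with ReAct, the loop alternates between a reasoning step that picks an action and an acting step that runs a tool and records the observation. The sketch below is plain Python and does not use PocketFlow's API; the tool table and the scripted policy standing in for the LLM are hypothetical, and a graph-based implementation would model these steps as nodes rather than a `for` loop.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool table; a real agent would expose MCP server tools instead.
TOOLS: Dict[str, Callable[[str], str]] = {
    "create_lead": lambda name: f"created lead {name}",
}

OBS_PREFIX = "observation: "


def scripted_policy(history: List[str]) -> Tuple[str, str]:
    """Stand-in for the LLM: act once, then finish with the observation."""
    observations = [h for h in history if h.startswith(OBS_PREFIX)]
    if not observations:
        return ("create_lead", "John Smith")
    return ("finish", observations[-1][len(OBS_PREFIX):])


def react_loop(task: str, max_turns: int = 5) -> Tuple[str, int]:
    """Run reason/act turns until the policy finishes or turns run out."""
    history = [f"task: {task}"]
    for turn in range(1, max_turns + 1):
        action, argument = scripted_policy(history)  # reasoning step
        if action == "finish":
            return argument, turn
        result = TOOLS[action](argument)  # acting step
        history.append(OBS_PREFIX + result)
    return "max_turns reached", max_turns


answer, turns = react_loop("Create a new lead named John Smith")
print(answer, turns)
```

The `max_turns` cap plays the same role as the `max_turns` runtime setting: it bounds the loop if the policy never decides to finish.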
from agent.pocketflow import MCPReactAgent
from dt_arena.src.types.agent import AgentConfig, RuntimeConfig
agent_config = AgentConfig.from_yaml("config.yaml")
runtime_config = RuntimeConfig(
    model="gpt-4o",
    temperature=0.1,
    max_turns=100,
    output_dir="./results"
)
agent = MCPReactAgent(agent_config, runtime_config)
async with agent:
    result = await agent.run(
        "Create a new lead named John Smith",
        metadata={"task_id": "test-005", "domain": "crm"}
    )
Multi-turn Conversations
All agents support multi-turn conversations. You can either make sequential calls or pass a list of queries:
# Method 1: Sequential calls (agent remembers context)
async with agent:
    result1 = await agent.run("List all leads in my account")
    result2 = await agent.run("How many leads are there total?")
    result3 = await agent.run("Create a new lead named Test User")
    # Reset conversation for fresh start
    agent.reset_conversation()
# Method 2: Pass list of queries
async with agent:
    queries = [
        "List all leads in my account.",
        "How many leads are there total?",
        "Create a new lead named Test User."
    ]
    result = await agent.run(queries, metadata={"task_id": "multi-turn-001"})
Framework Reference
| Framework | Agent Class | Default Model |
|---|---|---|
| OpenAI Agents SDK | OpenAISDKAgent | gpt-4o |
| Claude SDK | ClaudeSDKAgent | claude-sonnet-4-20250514 |
| Google ADK | GoogleADKAgent | gemini-2.0-flash |
| LangChain | LangChainAgent | gpt-4o |
| PocketFlow | MCPReactAgent | gpt-4o |
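When scripting across frameworks, the table above can be mirrored as a small lookup. This is a sketch: the keys reuse the package names from the `agent.<name>` imports shown earlier, while `FrameworkInfo`, `FRAMEWORKS`, and `pick_model` are hypothetical helpers and not part of the framework.

```python
from typing import NamedTuple, Optional


class FrameworkInfo(NamedTuple):
    agent_class: str    # class name as listed in the table
    default_model: str  # default model as listed in the table


FRAMEWORKS = {
    "openaisdk": FrameworkInfo("OpenAISDKAgent", "gpt-4o"),
    "claudesdk": FrameworkInfo("ClaudeSDKAgent", "claude-sonnet-4-20250514"),
    "googleadk": FrameworkInfo("GoogleADKAgent", "gemini-2.0-flash"),
    "langchain": FrameworkInfo("LangChainAgent", "gpt-4o"),
    "pocketflow": FrameworkInfo("MCPReactAgent", "gpt-4o"),
}


def pick_model(framework: str, override: Optional[str] = None) -> str:
    """Return the override if given, otherwise the framework's default model."""
    if framework not in FRAMEWORKS:
        raise KeyError(f"Unknown framework: {framework!r}")
    return override or FRAMEWORKS[framework].default_model


print(pick_model("claudesdk"))
print(pick_model("langchain", override="gemini-2.0-flash"))
```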