Domains
DTap covers 14 high-stakes domains spanning enterprise software, operating systems, finance, healthcare, and more. Each domain ships with policy-aligned benign and malicious tasks, sandboxed environments, and automated judges.
Workflow
Productivity, communication and finance workflow apps.
20 environments
CRM
Salesforce-style customer relationship management.
1 environment
Customer Service
ServiceNow-style customer-support case workflows.
1 environment
Travel
Hotel, flight and rental booking flows.
5 environments
Coding
GitHub, GitLab and terminal-driven engineering tasks.
3 environments
Browser
E-commerce browsing, search and checkout.
1 environment
Research
arXiv-driven literature research and data-exfiltration tasks.
1 environment
OS-Filesystem
Shell-driven file-system operations.
1 environment
Windows
Windows desktop GUI agent benchmark.
1 environment
macOS
macOS desktop GUI agent benchmark.
1 environment
Finance
Yahoo Finance, Chase, and Robinhood agent flows.
3 environments
Legal
Harvey-style legal review and document drafting.
Telecom
Telecom customer-account workflows.
1 environment
Medical
Hospital client medical-service workflows.
1 environment
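For readers who want to work with the listing programmatically, the following is a minimal sketch that records the per-domain environment counts shown above and tallies them. The names `DOMAIN_ENVIRONMENTS` and `summarize` are hypothetical and not part of DTap; the Legal entry is left as `None` because the listing does not state a count for it.

```python
from typing import Optional

# Environment counts copied from the listing above.
# None marks a domain whose count is not stated there (Legal).
# This mapping is illustrative only, not an official DTap data structure.
DOMAIN_ENVIRONMENTS: dict[str, Optional[int]] = {
    "Workflow": 20,
    "CRM": 1,
    "Customer Service": 1,
    "Travel": 5,
    "Coding": 3,
    "Browser": 1,
    "Research": 1,
    "OS-Filesystem": 1,
    "Windows": 1,
    "macOS": 1,
    "Finance": 3,
    "Legal": None,
    "Telecom": 1,
    "Medical": 1,
}


def summarize(catalog: dict[str, Optional[int]]) -> tuple[int, int, list[str]]:
    """Return (number of domains, total of stated counts, domains without a count)."""
    stated_total = sum(count for count in catalog.values() if count is not None)
    unstated = [name for name, count in catalog.items() if count is None]
    return len(catalog), stated_total, unstated


num_domains, stated_total, unstated = summarize(DOMAIN_ENVIRONMENTS)
print(f"{num_domains} domains, {stated_total} environments with a stated count")
print(f"Domains without a stated count: {unstated}")
```

Keeping the missing count as `None` rather than guessing a number keeps the tally honest: the total reflects only what the listing actually states.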