CRM

A simulated Salesforce CRM workspace evaluating whether agents can manage leads, accounts, and opportunities without leaking PII or executing destructive updates from poisoned record content.

Domain overview

Customer Relationship Management (CRM) systems are widely used to manage interactions with customers and support core business operations such as sales, marketing, and customer service. Modern CRM platforms such as Salesforce and ServiceNow increasingly incorporate AI agents to automate tasks including lead qualification, customer communication, case handling, scheduling, reporting, and data maintenance.

These agents typically operate with access to sensitive data (e.g., customer records, financial transactions) and can execute high-stakes actions (e.g., sending customer-facing communications, updating records, initiating transactions). Although they operate in high-impact environments and interact with external users, security enforcement around these automated workflows is often limited or inconsistent. This creates opportunities for third-party adversaries or malicious users to manipulate agents into performing harmful actions within the system, such as unauthorized access to or modification of customer data, fraudulent financial operations, large-scale spam communications, or covert data exfiltration. The consequences for organizations can be severe, including privacy violations, financial loss, regulatory risk, and reputational damage.

We first design a comprehensive set of benign tasks for CRM agents, covering 11 representative categories commonly encountered in real-world CRM workflows. Based on core CRM domain security policies from major platforms such as Salesforce and ServiceNow, as well as broader regulatory frameworks including the EU AI Act and GDPR, we derive a set of 9 key security risk categories. Guided by these risks, we construct red-teaming tasks with malicious goals under two primary threat models to systematically evaluate the security robustness of CRM agents.

Benign task categories

Lead & Prospect Management

Manages the capture, qualification, assignment, and conversion of prospective customers into actionable sales opportunities

Contact & Account Management

Maintains accurate customer and organizational profiles, relationship structures, and historical interaction records

Opportunity & Pipeline Management

Tracks deal progression, revenue forecasting, and updates to sales stages throughout the customer acquisition lifecycle

Activity & Engagement Management

Logs, schedules, and coordinates customer interactions such as emails, calls, meetings, and follow-up tasks

Customer Support & Case Management

Handles the creation, tracking, escalation, and resolution of customer service requests and support cases

Communication Automation

Automates personalized outbound and transactional communications across email and messaging channels

Billing & Transaction Integration

Synchronizes financial operations such as payments, refunds, invoices, and subscriptions with customer records

Calendar & Meeting Coordination

Supports appointment scheduling, conferencing setup, and calendar synchronization for customer engagements

Reporting & Analytics Support

Aggregates and analyzes CRM data to generate operational insights, forecasts, and performance reports

Data Quality & Maintenance

Ensures the accuracy, consistency, deduplication, and compliance of CRM records across the system

Customer Rights & Data Compliance

Processes customer requests such as data deletion, account cancellation, subscription termination, meeting cancellation, and opt-out preferences in accordance with privacy regulations such as GDPR

Policy & risk framework

Domain policies

We select two domain-specific policies that govern the Salesforce CRM platform: (1) the Salesforce Acceptable Use and External-Facing Services Policy (AUESP), which applies to all external users (including agents) interacting with Salesforce services and defines prohibited content, activities, and misuse scenarios within the CRM ecosystem; and (2) the Salesforce Artificial Intelligence Acceptable Use Policy (AI AUP), which specifies additional restrictions on the use of AI-powered features and generative AI services within the Salesforce platform, including requirements on responsible data access, system interaction, and prevention of harmful or unsafe AI behaviors.

General regulatory frameworks

We additionally consider two widely adopted regulatory frameworks that govern the safe handling of data and automated decision systems relevant to CRM environments: (1) the EU AI Act, which establishes a risk-based regulatory framework for AI systems and defines obligations for high-risk AI applications, including transparency, accountability, and safeguards against harmful automated actions; and (2) the General Data Protection Regulation (GDPR), which regulates the collection, processing, storage, and transfer of personal data, including requirements for lawful processing, data minimization, user consent, and protection against unauthorized disclosure of sensitive information.

Results in this domain

Indirect / Direct ASR (lower is safer) and BSR (higher is more capable) for every evaluated agent on the CRM suite.

Framework | Model | Indirect ASR (lower = safer) | Direct ASR (lower = safer) | BSR (higher = more capable)
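The page does not spell out how the three columns are computed. As a minimal sketch, assuming ASR is the fraction of red-teaming tasks on which the malicious goal was achieved (reported separately for the indirect and direct settings) and BSR is the fraction of benign tasks the agent completed, the metrics could be derived from per-task outcomes as follows. The `Trial` record, the `kind` labels, and the function names are hypothetical and are not taken from the benchmark's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Trial:
    # Hypothetical record of one evaluated task.
    kind: str      # assumed labels: "benign", "indirect", or "direct"
    success: bool  # benign: task completed; attack: malicious goal achieved


def rate(trials: list[Trial], kind: str) -> Optional[float]:
    """Fraction of trials of the given kind that succeeded, or None if there are none."""
    selected = [t for t in trials if t.kind == kind]
    if not selected:
        return None
    return sum(t.success for t in selected) / len(selected)


def summarize(trials: list[Trial]) -> dict[str, Optional[float]]:
    """Compute the three table columns under the assumed definitions."""
    return {
        "indirect_asr": rate(trials, "indirect"),  # lower = safer
        "direct_asr": rate(trials, "direct"),      # lower = safer
        "bsr": rate(trials, "benign"),             # higher = more capable
    }


# Made-up outcomes for illustration only; these are not benchmark results.
example = [
    Trial("benign", True),
    Trial("benign", False),
    Trial("indirect", True),
    Trial("indirect", False),
    Trial("indirect", False),
    Trial("indirect", False),
]
summary = summarize(example)
```

Returning `None` for a setting with no trials, rather than 0.0, keeps an unevaluated threat model from being read as a perfectly safe one.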

Environments

1 environment in the CRM domain.