On Mar. 24, Salesforce released a practitioner’s guide detailing its experience building and deploying artificial intelligence (AI) agents at scale within the enterprise. The document highlights the challenges Salesforce encountered, the strategies it adopted, and the results it achieved while implementing AI agents to support real business outcomes.
The report aims to provide insights for organizations seeking to use AI agents effectively, emphasizing that success depends on shifting how these tools are managed and measured. According to Salesforce, most of the industry misunderstands what it takes to deploy functional AI agents in production environments.
The guide explains that traditional software is deterministic—producing consistent outputs from given inputs—while AI agents operate differently by interpreting context and generating varied responses. “Agents are fundamentally different. They have their own reasoning capabilities. They interpret context. They generate responses that vary, and that variation isn’t a bug—it’s the feature,” the paper says.
Salesforce draws parallels between managing employees and managing AI agents, suggesting that clear guidance, monitoring, and calibration improve agent performance over time. The company also addresses common misconceptions about creating universal agents capable of replacing entire human roles: “There’s a persistent myth that you can build one universal agent to replace an entire human role. You can’t. Not today.” Instead, Salesforce recommends breaking down jobs into specific tasks for specialized agent development.
One example cited is Salesforce’s Engagement Agent pilot for its Sales Development Representative (SDR) team. By focusing on well-defined SDR tasks rather than attempting full job automation, the pilot generated more than $120 million in annualized pipeline during its initial months: “Our SDR team couldn’t have hit their targets without the Engagement Agent.” The report details how iterative improvements modeled on top performers’ behaviors led the Engagement Agent to outperform most human sellers in certain areas.
Measurement is emphasized as crucial; agent competency should be evaluated against human benchmarks rather than generic tests unrelated to business needs: “The metric that matters isn’t whether the agent is ‘smart.’ It’s whether the agent can do this specific thing, reliably, at a level that meets or exceeds human performance.” The company also outlines its approach called the Agent Development Lifecycle (ADLC), where autonomy is granted incrementally as competency increases under continuous oversight.
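The report does not publish an implementation, but the gating idea behind the ADLC can be sketched in a few lines: compare an agent’s task-level success rate against a measured human baseline and grant autonomy in tiers as competency rises. All names, thresholds, and tiers below are illustrative assumptions, not Salesforce’s actual framework.

```python
# Illustrative sketch only (not Salesforce's implementation): gate an agent's
# autonomy on whether its performance on a specific task meets or exceeds a
# human benchmark, echoing the report's ADLC framing. The task name, tier
# labels, and the 0.8 "supervised" threshold are hypothetical.

from dataclasses import dataclass

@dataclass
class TaskBenchmark:
    task: str                  # hypothetical task identifier
    human_success_rate: float  # measured human baseline for this task

def autonomy_level(agent_success_rate: float,
                   benchmark: TaskBenchmark,
                   margin: float = 0.0) -> str:
    """Return an autonomy tier based on agent-vs-human performance.

    'autonomous'  — meets or exceeds the human baseline (plus margin)
    'supervised'  — within 80% of the baseline, still under oversight
    'development' — not yet competent enough for production use
    """
    if agent_success_rate >= benchmark.human_success_rate + margin:
        return "autonomous"
    if agent_success_rate >= 0.8 * benchmark.human_success_rate:
        return "supervised"
    return "development"

# Example: agents at various success rates on a task where humans average 90%
bench = TaskBenchmark(task="qualify_inbound_lead", human_success_rate=0.90)
print(autonomy_level(0.92, bench))  # autonomous
print(autonomy_level(0.75, bench))  # supervised
print(autonomy_level(0.50, bench))  # development
```

The point of the sketch is the comparison target: competency is judged per task against a human baseline, not against a generic capability test, and autonomy expands only as that bar is cleared.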
Looking ahead, Salesforce predicts a shift toward predictive competency, in which future agents will anticipate business needs proactively: “Unlike today’s agents…the agents of the future will recognize patterns across the enterprise and proactively surface insights…before they become problems.” Further installments of the series will explore technology mentorship for agents through the ADLC, workforce transformation as humans move into management roles over teams of specialized agents, and new frameworks for measuring trust and performance.


