My past two CustomerThink columns dealt with how to determine the return on agentic investments and whether agentic AI delivers at all.
The question of whether it delivers is particularly interesting for me, as I have been researching measurable results other than cost savings in contained business areas for some months now. I regularly find a very strong focus on customer service and marketing, with customer service functions being best able to report measurable results. This is evidenced by the number of success stories I find, supported by the recent publication of a Total Economic Impact (TEI) study of Zendesk customer service.
However, most of this is anecdotal evidence, or vendor sponsored/commissioned. And which vendor likes to speak about failures? The same goes for buyers, who understandably do not like to be in the spotlight with investments that turned out to be less than successful. There hasn't been much in-depth research on whether generative and/or agentic AI delivers on its promise or not.
Luckily, at least some research evaluating the capabilities of LLM-based AI agents in business environments has been published this year. CRMArena-Pro by Salesforce Research naturally focuses on CRM tasks across B2B and B2C scenarios. The authors identified nineteen tasks commonly executed in CRM systems and grouped them into four business skill categories: database querying and numerical computation, information retrieval and reasoning, workflow execution, and policy compliance; the benchmark also includes a confidentiality awareness evaluation. TheAgentCompany, on the one hand, covers a wider area along the business value chain, but on the other hand has a narrower focus on software engineering companies. Another main difference between the two studies is that TheAgentCompany concentrates on more complex tasks that require multiple steps to execute.
Both studies find that LLM-based agents deliver in simpler contexts while still failing in more complex ones.
Not that this comes as a surprise.
While this consistent result gives an indication of which scenarios to avoid, it offers little help on what actually to do, besides starting with simple scenarios and focusing on workflows.
This is where a recent study from the MIT NANDA project gives additional insight. The study, titled “The GenAI Divide – State of AI in Business 2025”, starts off with a quite unsettling finding: only five percent of organizations are extracting value from their integrated AI pilots. The rest, a staggering 95 percent, see no measurable P&L impact.
Why?
According to the study, the problem is the inability of integrated systems to retain feedback, adapt to context, or improve over time.
That’s a big ouch, given that enterprises often invest multi-million-dollar budgets into these initiatives.
This is in stark contrast to what has emerged as shadow AI: about 90% of all employees are using privately purchased licenses for ChatGPT, Perplexity, Claude, or other tools.
Figure 1: The steep drop from pilots to production for task-specific GenAI tools (source: The GenAI Divide)
And they are happy with how these consumer-grade tools help them in their daily work. Apparently, employees are convinced that genAI tools can and do help them in their work environments.
The important thing, then, is not to cry failure but to identify what businesses and vendors can do to make investments in generative and agentic AI a success. The GenAI Divide – State of AI in Business 2025 gives plenty of insight for both.
So, let’s have a look under the hood of this report.
What can buyer executives do?
First things first: buy, don’t make. This doubles your chances of success.
Look at the right problem and use the right KPIs. A lot of budget, around 50% of it, is sunk into sales and marketing. While this seems sexy, and results can be shown off with easy-to-gather (and easy-to-misattribute) KPIs, the real gains seem to lie in back-office automation. To mitigate the still-existing weaknesses in the automation of complex tasks, start simple. But think big. This is corroborated by the findings of both earlier studies, TheAgentCompany and CRMArena-Pro.
Look at the right partners. Do not rely only on artificial benchmarks but on business outcomes. The world of AI is different from traditional SaaS. AI is service as a software, so consider your vendors outsourcing partners instead of mere software suppliers. Have them prove that their AI tools can be configured to your needs, and hold them accountable.
Usability and flexibility are key. One of the core reasons for the success of tools like ChatGPT, Perplexity, Claude, or Gemini is their simple user interface plus their ability to be guided through the problem-solving process in a conversation. These conversations resemble iterations, which is regularly how humans work. Technologies like RAG and RAC help, but still need improvement. Again, start with the simpler problems and go from there.
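To make the idea of retrieval-grounded, iterative conversations a little more concrete, here is a minimal, purely illustrative sketch. It is not taken from any of the studies cited above; the knowledge-base entries, function names, and the simple keyword-overlap scoring are my own assumptions standing in for the embeddings and LLM call a real tool would use.

# Minimal, illustrative sketch of a retrieval-augmented conversational turn.
# Hypothetical data and scoring; a production system would use embeddings
# for retrieval and an LLM to draft the answer from the retrieved context.

from collections import Counter

KNOWLEDGE_BASE = {
    "refund_policy": "Refunds are issued within 14 days for unused services.",
    "escalation": "Tickets open longer than 48 hours are escalated to tier 2.",
    "data_privacy": "Customer data may not be shared outside the EU region.",
}

def retrieve(question: str) -> str:
    """Return the knowledge-base entry with the largest keyword overlap."""
    q_words = Counter(question.lower().split())
    def overlap(text: str) -> int:
        return sum((Counter(text.lower().split()) & q_words).values())
    best_key = max(KNOWLEDGE_BASE, key=lambda k: overlap(KNOWLEDGE_BASE[k]))
    return KNOWLEDGE_BASE[best_key]

def answer_turn(question: str, history: list[str]) -> str:
    """One conversational iteration: ground the reply in retrieved context."""
    context = retrieve(question)
    history.append(question)
    # In a real tool, an LLM would draft the reply from context plus history.
    return f"Based on our records: {context}"

if __name__ == "__main__":
    history: list[str] = []
    print(answer_turn("How long do refunds take for unused services?", history))

The point of the sketch is the shape of the loop, not the code itself: each conversational turn retrieves grounding material before answering, and the history accumulates so the next iteration can build on it.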
Trust your people. Your employees know what they need. This is made abundantly clear by their use of privately sourced AI tools. Do not centralize AI initiatives; have them driven where they matter, while providing a meaningful yet robust set of guidelines.
Plus, here’s a bonus tip: all of the above is particularly important for enterprises. Unlike SMBs, enterprises tend to be more vulnerable to internal politics, which, for the sake of sustained success, should be kept out of these initiatives.
What should vendors do?
Vendors should have a close understanding of who their clients are, and take a hard look at what these clients actually need and want. GenAI and agentic AI are the shiny new tools; still, they need to solve actual business problems. This means it is important not to build generic tools, but solutions that are capable of embedding themselves into business workflows and improving them. This requires a strong focus. Grow from there. If the big, established vendors don’t do that, there is a startup capable of disrupting them and eating their lunch. Start with solutions to problems that are not business critical, and extend from there.
One of the major complaints that users have about the tools at hand is that they do not learn and are too rigid. So, genAI tools need to incorporate learning mechanisms, whether supervised or unsupervised. They also need to provide a context window that is big enough for the complexity of the business challenges being addressed. Lastly, even if a process looks the same, it often isn’t. Consequently, build in a deep ability for configuration, maybe even customization.
A bonus tip for startups: many a buyer does not even consider you. They don’t know you, they don’t trust you. Heck, they don’t even know whether you’ll be around tomorrow. So, your best bet for getting into larger accounts is partnering with vendors or consultants that your buyers know and trust.
Vendor or buyer: do you need some help? Give me a call. I am there for you.