What is the real future of conversational AI?
Here are 5 insights from Chatbot Summit 2025 worth knowing.
Apr 9, 2025
What is the real future of conversational AI?
Here are 5 insights from Chatbot Summit 2025 worth knowing.
Apr 9, 2025

Brussels

Early conversational systems relied on branching logic and predetermined responses. They were powerful, but limited in scalability and depth. Today, those frameworks are giving way to intelligent, autonomous agents that plan, adapt, and even create. But this shift isn’t just about efficiency. It’s about empowerment. As summit founder Yoav Barel puts it:
“We want these systems to be empowerable. Not to do more for us, but to let us do more with less.”
For teams like ours at Empathy Lab by EPAM, this evolution reinforces the growing need to connect powerful backend capabilities with thoughtful, user focused design. This will ensure that as agents grow smarter, they also grow more human.
AI supervisors are essential, but incomplete
Niklas Heidloff of IBM dove into one of the critical layers of agentic systems – observability. As AI agents become more autonomous, they still have the tendency to fail, silently. The solution? Integrating AI supervisor agents that oversee agent behavior, identify breakdowns, and maintain alignment with intent and safety protocols.
However, as Niklas pointed out, the current generation of AI supervisors typically only analyze the final output, not the entire request chain. This means they may miss the nuanced ways a response could go wrong, especially if a tool was invoked incorrectly or the reasoning was flawed midway.
Analyzing the entire request chain could significantly improve answer quality and reduce errors, but it also demands far more computational power. This trade-off is one of the key challenges for the coming years. The path forward? Smarter sampling techniques, prioritized observability layers, and lightweight critique models – creative solutions that aim to scale oversight without overwhelming tech budgets.
From red to blue teaming: stress-testing AI for real-world readiness
Alex Combessie from Giskard AI offered a timely reminder: GenAI systems, especially LLM-based agents, are still inherently risky. Hallucinations, prompt injections, and over-reliance from users can lead to real-world legal, ethical, and financial consequences.
Giskard has pioneered continuous red teaming, simulation of attacks to test AI robustness. But Combessie stressed that the future lies in blue teaming, which not only helps detect vulnerabilities but also prevents and heals from them.
“We’re not just trying to find problems anymore”, he explained. “We’re building tools to help developers solve them before they go live”. In this way, blue teaming becomes a proactive, friendly evolution of traditional testing.
Modular AI at scale – the case of Vodafone’s SuperTOBI
Vodafone’s conversational AI journey, led by Thomas Neumann, is a lesson in smart scaling. Their GenAI-powered assistant, SuperTOBI, now handles tens of millions of interactions annually across 24 markets.
Their modular approach uses different models for different use cases, blending lightweight efficiency with large model power only when necessary. And the results speak for themselves: for example, using GenAI to refine intent understanding alone improved automation rates by 12%.
SuperTOBI next frontier is voice – but that comes with added complexity. As Giskard highlighted, voice remains one of the hardest areas to test and secure. Most red teaming today applies only to textual interfaces, leaving a major gap in safety for voice AI systems, a challenge that remains largely unsolved.

Photo credit: Chatbot Summit
Five Key takeaways from the Chatbot Summit 2025
1. Relying on red teaming alone is not enough
Security isn’t just about detection. AI agents must come with embedded, developer-accessible tools for repair and self healing. Expect automated debugging, proactive threat mitigation, and even self reflective systems to become standard.
2. Voice is future focus for many, but comes with many challenges
Voice remains a dominant channel (especially in telecoms), but it lags far behind in safety and oversight. Most observability and red teaming tools are still text-only, leaving voice systems under-tested and vulnerable. Future systems must integrate multimodal red teaming and design for voice parity where possible.
3. AI supervisors need smarter architectures
Current supervisors only analyze outputs. Future AI systems will require lightweight, context-aware critique layers that monitor entire reasoning chains, without overloading GPUs or introducing latency.
4. LLMs need to work in synergy with NLP systems
While LLMs excel at generating fluent and natural responses, NLP systems provide the structured understanding needed to interpret meaning. As Alex Combessie of Giskard AI put it, LLMs are “completion machines”, not “truth machines”. Niklas Heidloff from IBM echoed this, noting that the probabilistic nature of LLMs makes them harder to debug compared to deterministic NLP pipelines. The future isn’t about choosing one over the other as some suggest, it’s about combining both to build agents that understand accurately (NLP) and respond naturally (LLM).
5. The front-end gap is real, and growing
While the summit was largely backend-focused (models, APIs, architecture), there's a growing need for conversational design expertise that balances functionality with usability. Teams like Empathy Lab, who specialize not just in what AI can do, but how it should empathize with users, are uniquely positioned to fill this gap.

Photo credit: Chatbot Summit
The final ingredient is choosing the right partner(s)
As the summit made clear, building truly effective AI agents isn’t just about the right infrastructure, it’s about the right collaborators. From Vodafone’s modular SuperTOBI to Henkel’s customer-first chatbot framework, the most successful implementations were those built in partnership with teams who understood both the tech and the user.
This is where experience design becomes strategy. Knowing which channels to prioritize, which intents to automate, and when to pass control to a human. All of this defines the success of AI in real-world use. And that’s the space Empathy Lab occupies. We help teams not just build smarter agents, but design experiences that are clear, contextual, and meaningful from the first message.
Because in 2025, the question won’t be whether you’re deploying AI agents, chatbots, and assistants, but whether you’re doing it right.

Photo credit: Chatbot Summit

Photo credit: Chatbot Summit