ChatSee raises $6.5M to build ‘failure memory’ for enterprise AI agents
ChatSee.AI Inc. has raised $6.5 million in seed funding to develop a failure intelligence layer for autonomous AI systems. The round was led by True Ventures. The company aims to observe AI agent failures, preserve context, and create a knowledge base to prevent recurrence.
ChatSee.AI Inc., a company that provides a failure intelligence layer for autonomous artificial intelligence systems, has raised $6.5 million in seed funding.
True Ventures led the round, announced Thursday, with participation from First Rays Venture Partners, Seven Hills Ventures and other industry veterans. “Whether they like it or not, AI is already in the enterprise,” co-founder and Chief Executive Sekhar Sarukkai told SiliconANGLE in an interview.
AI agents are already arriving at the doorsteps of enterprise teams through Microsoft Corp.’s Copilot, Databricks Inc.’s Genie, Snowflake Inc., Workday Inc., OpenAI Group PBC, Anthropic PBC and internal builds. That list doesn’t even begin to reflect the growing ecosystem of open-source projects, including OpenClaw, NemoClaw, Hermes and others. Agents are here, and they represent an operational reality.
As enterprises move agents from pilots into production, the challenge of governing and controlling them is shifting from whether they can build and test them in simulation to whether they can trust them with real customers and employee work.
“They all realize that it’s a nondeterministic infrastructure, and they cannot test their way out of failures,” Sarukkai added.
Sarukkai said ChatSee is entering the industry to close the “confidence gap” with what it describes as a failure intelligence layer: a model designed to observe when agents fail, preserve the surrounding context, capture how problems were fixed and feed that knowledge back so that future agent actions can avoid the same failure.
Beyond observability, the vision is to provide self-learning and adaptivity at scale.
Under the hood, ChatSee uses a taxonomy built by collecting more than 10,000 grounded examples of enterprise agent failures and classifying them into 157 categories. These include tool-call failures and failures across phases such as scoping, reasoning and execution. The categories widen the scope of observation and failure correction from monitoring for the industry’s first failure mode, hallucinations, to a broader set of equally subtle issues.
Where the agentic rubber meets the road
Over the past few years, business teams have moved from using AI to power chatbots to driving fully autonomous agents that take actions on their own, break down tasks and handle long-horizon activities. Many of them are now being incorporated directly into core operations, where subtle problems are not immediately visible; a minor misalignment at scale can become a major problem if left unattended.
“These are not classic conversational support kind of agents,” said Sarukkai. “These are really supporting core business.”
In many use cases, AI agents are being deployed in e-commerce and financial services, providing decision-making capabilities such as catalog validation, pricing, transaction labeling and merchant code classification. What happens when an agent is subtly wrong about a merchant code, and this propagates? When a human catches and corrects the problem, that correction needs to also propagate to every agent working across the system.
“Think of it as a failure knowledge base … which the agent can be configured at the platform level to reference,” Sarukkai said.
This means that if any one agent in the system runs into a problem, such as being corrected by a human, repeatedly failing a tool call or changing behavior so that API calls begin breaking, it self-corrects. If those corrections are critical or become a trend, they are written to a central authority that other agents can consult, so the fixes become best practices in the future.
“Intelligence is not lost,” Sarukkai said, explaining the vision. “We continue to build this failure intelligence, both from humans as well as our own judgment.”
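The promotion rule described above, where a correction is shared once it is critical or becomes a trend, can be illustrated with a minimal Python sketch. This is not ChatSee’s implementation: the class names, the fields and the `trend_threshold` parameter are all hypothetical, chosen only to show how a shared failure knowledge base might behave.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Correction:
    """One observed failure and its fix (hypothetical schema)."""
    category: str           # e.g. "merchant_code_classification"
    pattern: str            # short description of what went wrong
    fix: str                # how a human or the agent corrected it
    critical: bool = False  # critical fixes are shared immediately


@dataclass
class FailureKnowledgeBase:
    """Central store that agents consult before acting (illustrative only)."""
    trend_threshold: int = 3  # assumed: repeats needed to count as a trend
    _seen: Counter = field(default_factory=Counter, init=False)
    _shared: dict = field(default_factory=dict, init=False)

    def record(self, correction: Correction) -> bool:
        """Log a correction; return True if it is now shared with all agents."""
        key = (correction.category, correction.pattern)
        self._seen[key] += 1
        # Promote when the correction is critical or has become a trend.
        if correction.critical or self._seen[key] >= self.trend_threshold:
            self._shared[key] = correction.fix
        return key in self._shared

    def lookup(self, category: str, pattern: str) -> Optional[str]:
        """Return the shared fix for a known failure, or None if there is none."""
        return self._shared.get((category, pattern))
```

In this sketch a one-off correction stays local to the agent that hit it, while a repeated or critical one becomes visible to every agent that calls `lookup` before acting. A production system would also need to preserve the surrounding context and handle conflicting fixes, which this example leaves out.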
The fundamental proposition of ChatSee is that enterprises are building and deploying more AI agents while the tooling layer around them is still catching up. Startups such as Voker are building platforms to understand how agents perform in the wild, while Respan focuses on proactive observability and root-cause analysis across agent trials. Similarly, Monte Carlo Data Inc.’s AI observability launch shows that data observability vendors are extending into AI inputs, outputs and quality monitoring.
“Many of the most significant AI risks emerge at runtime as agents operate autonomously,” said Dr. Eduard Amoroso, CEO of research and advisory firm TAG-infosphere Inc. “Because these systems are probabilistic and adaptive, static testing alone is insufficient. This is driving the need for continuous runtime assurance across enterprise workflows.”
As ChatSee sees it, observability tells teams what happened and evaluation tells teams whether the agent performed well. The company wants to become a memory layer for what failed, why it failed and how to prevent recurrence.
The industry is trending toward self-learning and self-healing agents, and as more agents cooperate, operate in swarms and work alongside humans, there will be more opportunities for rich capabilities that let agents collaborate to do work and avoid past mistakes.