DANDI
Can your AI agent tell when your own systems disagree?
The Problem
Every enterprise runs on contradictions it doesn't know about.
A sales rep promises free migration in Slack. The signed contract says nothing about it. The billing system charges for it anyway. The CRM shows the deal as closed-won with no special terms. Four systems, four versions of the truth, and no one notices until a customer calls angry.
This is already expensive with humans in the loop. It gets dangerous when you hand it to AI.
Enterprise AI agents are shipping now — handling support tickets, closing deals, managing renewals. Every one of them assumes the data it reads is consistent. It isn't. The average enterprise runs on dozens of SaaS applications that were never designed to agree with each other. When an agent pulls context from Slack, the CRM, and the contract store, it treats all three as equally true. It has no mechanism to notice they contradict each other, no way to flag the conflict, and no way to resolve it.
This gives every deployed agent a silent failure mode: it will confidently act on contradictory information and nobody will know until something breaks. As agents move deeper into billing, compliance, and customer-facing decisions, this isn't a quality issue. It's an operational risk — and it scales with every new deployment.
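To make that failure mode concrete, here is a minimal sketch of the flattening step most retrieval pipelines perform today. Every name and record below is hypothetical; the point is only that once the sources are concatenated into one context blob, the contradiction between them is invisible to the agent.

```python
# Hypothetical records for one account; none of this data or naming
# comes from a real system.
SOURCES = {
    "slack":    "Rep to customer: 'We'll include migration at no charge.'",
    "crm":      "Deal closed-won. Special terms: none.",
    "contract": "Signed agreement. Migration services: not mentioned.",
    "billing":  "Invoice line item: data migration, $4,000.",
}

def build_agent_context(sources: dict[str, str]) -> str:
    # Each record is treated as equally true. Nothing here compares the
    # sources against one another, so the conflict never surfaces.
    return "\n\n".join(f"[{name}] {text}" for name, text in sources.items())

if __name__ == "__main__":
    print(build_agent_context(SOURCES))
```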
What We're Exploring
Current AI systems retrieve from enterprise data. We think they need to navigate it.
The difference matters. Retrieval pulls chunks that match a query. Navigation means traversing across systems, following a reference from a Slack message to a contract to a billing record, and noticing when the trail contradicts itself. We're designing a standardized, provider-agnostic interface that lets agents treat heterogeneous enterprise systems as a single traversable substrate, so the same agent can walk Slack-to-HubSpot-to-Drive or Teams-to-Salesforce-to-SharePoint.
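One way to picture that interface, sketched under our own assumptions rather than as a finished design: every system exposes its records as nodes with typed cross-references, and a single walk routine follows those references regardless of which provider sits behind them. All class and method names here are illustrative.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol

@dataclass
class Node:
    system: str    # e.g. "slack", "hubspot", "drive"
    node_id: str   # provider-native identifier
    kind: str      # "message", "deal", "contract", "invoice", ...
    payload: dict  # the normalized record itself

@dataclass
class Edge:
    relation: str  # "references", "amends", "bills_for", ...
    target: Node

class TraversableSystem(Protocol):
    """Contract each provider adapter would implement."""
    def get(self, node_id: str) -> Node: ...
    def neighbors(self, node: Node) -> Iterable[Edge]: ...

def walk(start: Node, systems: dict[str, TraversableSystem], max_hops: int = 3):
    """Breadth-first walk across systems, following typed cross-references."""
    frontier = [start]
    seen = {(start.system, start.node_id)}
    for hop in range(max_hops + 1):
        next_frontier = []
        for node in frontier:
            yield node
            if hop == max_hops:
                continue  # stay within the hop budget
            for edge in systems[node.system].neighbors(node):
                key = (edge.target.system, edge.target.node_id)
                if key not in seen:
                    seen.add(key)
                    next_frontier.append(edge.target)
        frontier = next_frontier
```

The choice that matters is the shared node-and-edge contract: the walk routine never needs to know whether the next hop lives in Slack or in Salesforce.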
The picture we're working toward: you point an agent at a customer account. It walks your Slack, your CRM, your contracts, your billing system. It comes back and says: these three sources disagree about this customer's billing state — here's what each one claims, here's what actually happened, here's the evidence. That reconciliation is durable. The next agent that touches this account doesn't rediscover the same conflict from scratch.
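What "durable" could mean in practice: the agent's finding is persisted as a small structured record that the next agent loads instead of redoing the investigation. The schema below is our hypothetical sketch, not a defined format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Claim:
    system: str       # "slack", "crm", "contract", "billing"
    record_ref: str   # pointer back to the source record
    statement: str    # what this system asserts about the account

@dataclass
class Reconciliation:
    account_id: str
    topic: str                 # e.g. "migration fee"
    claims: list[Claim]        # the conflicting versions, kept verbatim
    resolution: str            # what actually happened
    evidence_refs: list[str]   # records that back the resolution
    resolved_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
```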
Getting there opens research questions that don't have good answers yet.
We're building SynCorp-2026 to start testing this — an open benchmark using real enterprise services populated with injected contradictions. Not a document-level retrieval test. A test of whether agents can catch the conflicts that live in the gaps between real systems.
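To give a feel for the kind of case we mean (the format and names below are illustrative, not the benchmark's actual schema): a single ground-truth fact is seeded across services, one or more copies are deliberately corrupted, and the agent passes only if it reports the conflict and names the systems involved.

```python
# Illustrative only; not SynCorp-2026's actual format.
injected_case = {
    "case_id": "billing-migration-001",
    "ground_truth": "Migration was promised free of charge, in writing.",
    "seeded_records": {
        "slack":    "Rep: 'migration is on us, no charge'",
        "contract": "No migration clause.",
        "billing":  "Invoice: migration services, $4,000",
        "crm":      "Closed-won, special terms: none",
    },
    "expected_finding": {
        "conflict": True,
        "systems_in_conflict": ["slack", "billing", "crm"],
    },
}

def score(agent_report: dict, case: dict) -> bool:
    """Pass only if the agent flags the conflict and names the right systems."""
    want = case["expected_finding"]
    return (
        agent_report.get("conflict") == want["conflict"]
        and set(agent_report.get("systems_in_conflict", []))
        == set(want["systems_in_conflict"])
    )
```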