Agentic Engineering
Lab results mean nothing if they don't survive deployment.
Most agent research ends at the benchmark. A paper reports state-of-the-art numbers on a controlled task, and the work is considered done. But the hardest problems in building agents aren't about making them smarter in sandboxes. They're about making them survive contact with the real world.
Real enterprise systems are messy. Data is scattered across Slack threads, CRM records, contracts, and billing systems that don't agree with each other. APIs are inconsistent. State is fragmented. Contradictions are everywhere, and no one system has the full picture. Agents that work in these environments need to navigate, discover, reconcile, and recover — not just retrieve and respond.
Our agentic engineering research focuses on the architecture and evaluation of agents that operate over real systems under real constraints. How should agents traverse heterogeneous data sources? How do they detect and resolve contradictions? How do they recover when they fail? These are engineering questions with research-grade answers.
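To make the traversal-and-reconciliation question concrete, here is a minimal sketch, under assumed stub data sources and field names (not any project's actual implementation): pull the same entity from several systems, diff the overlapping fields, and queue every disagreement for reconciliation instead of silently picking a winner.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical source adapters: each holds its own partial, possibly
# inconsistent view of the same customer. Real systems would sit behind
# APIs (CRM, billing, contract store); here they are stubbed in memory.
CRM = {"acme": {"plan": "enterprise", "seats": 250, "renewal": "2025-06-30"}}
BILLING = {"acme": {"plan": "enterprise", "seats": 180}}
CONTRACTS = {"acme": {"seats": 250, "renewal": "2025-09-30"}}

SOURCES = {"crm": CRM, "billing": BILLING, "contracts": CONTRACTS}


@dataclass
class Contradiction:
    entity: str
    field_name: str
    values: dict[str, Any]  # source name -> value that source reports


def traverse(entity: str) -> tuple[dict[str, Any], list[Contradiction]]:
    """Gather one entity's fields across all sources and flag disagreements."""
    seen: dict[str, dict[str, Any]] = {}  # field -> {source: value}
    for source_name, store in SOURCES.items():
        for fld, value in store.get(entity, {}).items():
            seen.setdefault(fld, {})[source_name] = value

    merged: dict[str, Any] = {}
    conflicts: list[Contradiction] = []
    for fld, by_source in seen.items():
        if len(set(map(repr, by_source.values()))) == 1:
            merged[fld] = next(iter(by_source.values()))
        else:
            # Do not guess a winner: surface the conflict for reconciliation.
            conflicts.append(Contradiction(entity, fld, by_source))
    return merged, conflicts


if __name__ == "__main__":
    merged, conflicts = traverse("acme")
    print("agreed:", merged)
    for c in conflicts:
        print(f"conflict on {c.field_name}: {c.values}")
```

Even in this toy form, the design choice matters: the agent's output is two artifacts, the facts all systems agree on and an explicit worklist of contradictions, rather than a single merged record that hides where the data disagreed.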
“The test of an agent isn't whether it works in a demo. It's whether it works on a Tuesday afternoon with bad data.”
Projects
Anveshak
Failed coding-agent runs contain real diagnostic work that gets thrown away. We build recovery agents that read those traces and finish the job at a fraction of the cost.
DANDI
Enterprise AI agents act on contradictory data across systems and never notice. We build agents that traverse Slack, CRM, and contracts to find and reconcile those conflicts.
SUTRAM
Security logs contain evidence of attacks that no pre-built rule anticipated. We investigate whether an AI system can review yesterday's logs and surface attack chains on its own.
Valmik
Every enterprise AI deployment requires a hand-built ontology that costs months and millions. We build agents that discover a company's conceptual structure directly from its live systems.
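As a rough illustration of the discovery idea, under assumed stub connectors and record shapes (not Valmik's actual method), the sketch below samples records from each live system, compares which fields co-occur, and proposes that systems with heavily overlapping fields describe the same underlying concept.

```python
from itertools import combinations

# Hypothetical record samples pulled from live systems; real connectors
# would page through APIs or database tables. Keys are system names.
SAMPLES = {
    "crm":     [{"name": "Acme", "plan": "enterprise", "owner": "dana"},
                {"name": "Initech", "plan": "starter", "owner": "raj"}],
    "billing": [{"name": "Acme", "plan": "enterprise", "invoice_id": 9912},
                {"name": "Initech", "plan": "starter", "invoice_id": 9913}],
    "support": [{"ticket_id": 1, "severity": "high", "name": "Acme"}],
}


def field_signature(system: str) -> frozenset[str]:
    """All fields observed for one system across its sampled records."""
    fields: set[str] = set()
    for record in SAMPLES[system]:
        fields.update(record)
    return frozenset(fields)


def propose_shared_concepts(min_overlap: int = 2) -> list[tuple[str, str, set[str]]]:
    """Pairs of systems whose records share enough fields to suggest they
    describe the same underlying concept (e.g. a customer account)."""
    proposals = []
    for a, b in combinations(SAMPLES, 2):
        shared = set(field_signature(a) & field_signature(b))
        if len(shared) >= min_overlap:
            proposals.append((a, b, shared))
    return proposals


if __name__ == "__main__":
    for a, b, shared in propose_shared_concepts():
        print(f"{a} and {b} likely describe one concept via fields {sorted(shared)}")
```

This field-overlap heuristic is only a starting point; the research question is how an agent refines such candidate concepts against live behavior instead of relying on a hand-built ontology up front.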