Model Training & Efficiency
The frontier isn't only about scale.
The dominant narrative in AI is that bigger is better. More parameters, more data, more compute. And it's true — scale works. But it also excludes. A 70-billion-parameter model is useless to a teacher in a small-town Indian school. A training run that costs millions is inaccessible to independent researchers. The AI revolution, as currently designed, has a very small guest list.
We think there's serious, publishable, impactful research in the other direction. How small can a model be and still hold a useful conversation? Can you transfer reasoning across languages by swapping embeddings instead of retraining? Can constrained vocabularies and targeted distillation produce specialist models that outperform generalists on narrow tasks at a fraction of the cost?
These aren't compromises. They're research frontiers. Every parameter saved, every training shortcut validated, every deployment made cheaper expands who gets to use AI and who gets to build it.
Projects
BodhiLekhan
Handwriting recognition treats every page as if it came from a stranger, even for writers it has seen before. We investigate persistent writer adaptation that learns once and recognizes at full speed from then on.
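To make the idea concrete, here is a minimal sketch under assumptions of our own: a frozen recognizer backbone, a tiny FiLM-style adapter fitted once per writer from a few labelled samples, and a cache keyed by writer ID so every later page pays only a dictionary lookup. The class names, adapter form, and training loop are illustrative, not BodhiLekhan's actual design.

```python
# Minimal sketch of persistent writer adaptation (illustrative, not BodhiLekhan's design).
import torch
import torch.nn as nn

class WriterAdapter(nn.Module):
    """Tiny per-writer module: scales and shifts backbone features (FiLM-style)."""
    def __init__(self, feat_dim: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(feat_dim))
        self.shift = nn.Parameter(torch.zeros(feat_dim))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return feats * self.scale + self.shift

class AdaptiveRecognizer(nn.Module):
    def __init__(self, backbone: nn.Module, head: nn.Module, feat_dim: int):
        super().__init__()
        self.backbone = backbone              # shared across writers
        self.head = head                      # shared across writers
        self.feat_dim = feat_dim
        self.cache: dict[str, WriterAdapter] = {}   # writer_id -> fitted adapter
        for p in list(self.backbone.parameters()) + list(self.head.parameters()):
            p.requires_grad = False           # shared weights stay frozen

    def adapt_once(self, writer_id: str, pages: torch.Tensor,
                   labels: torch.Tensor, steps: int = 50) -> None:
        """Fit a small adapter on a few labelled samples; run only once per writer."""
        adapter = WriterAdapter(self.feat_dim)
        opt = torch.optim.Adam(adapter.parameters(), lr=1e-2)
        with torch.no_grad():
            feats = self.backbone(pages)      # backbone features computed once
        for _ in range(steps):
            logits = self.head(adapter(feats))
            loss = nn.functional.cross_entropy(logits, labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
        self.cache[writer_id] = adapter

    @torch.no_grad()
    def recognize(self, writer_id: str, pages: torch.Tensor) -> torch.Tensor:
        """Full-speed inference: a cache lookup plus one elementwise op."""
        feats = self.backbone(pages)
        if writer_id in self.cache:
            feats = self.cache[writer_id](feats)
        return self.head(feats).argmax(dim=-1)
```

In this sketch the adaptation cost is paid once per writer, and the cached per-writer state is only two vectors of size feat_dim, so storing many writers stays cheap.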
Dhaatu
Multilingual AI requires expensive retraining to reason in each new language. We investigate whether reasoning and language grounding are separable, so one core model serves many languages by swapping only embeddings.
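One concrete form the question can take, sketched under the assumption that only the token embedding and a tied output head are language-specific while the transformer core is shared and frozen. Names and structure are illustrative, not Dhaatu's implementation.

```python
# Minimal sketch of per-language embedding swapping around a shared reasoning core.
import torch
import torch.nn as nn

class SwappableLM(nn.Module):
    def __init__(self, core: nn.Module, d_model: int):
        super().__init__()
        self.core = core                      # shared reasoning core
        self.d_model = d_model
        self.embeddings = nn.ModuleDict()     # language -> token embedding table
        self.active: str | None = None
        for p in self.core.parameters():
            p.requires_grad = False           # core trained once, then frozen

    def add_language(self, lang: str, vocab_size: int) -> None:
        """Register a new language; only this table needs training."""
        self.embeddings[lang] = nn.Embedding(vocab_size, self.d_model)

    def set_language(self, lang: str) -> None:
        self.active = lang

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        emb = self.embeddings[self.active]
        hidden = self.core(emb(token_ids))    # reasoning is shared across languages
        return hidden @ emb.weight.T          # tied output projection per language

# Usage: train `core` once, then for each new language optimize only its
# nn.Embedding (a small fraction of total parameters) with the core frozen.
```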
Ekdant
Training an AI tutor by rewriting the entire model is expensive and risks degrading the reasoning it teaches. We explore architectures that separate teaching strategy from subject knowledge, so pedagogy trains cheaply and ports across models.
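One way to make the separation concrete is a LoRA-style low-rank adapter that carries tutoring behaviour while the base model, and the subject knowledge encoded in it, stays frozen. The sketch below is a hypothetical illustration assuming matching layer shapes between models; it is not Ekdant's architecture.

```python
# Minimal sketch: pedagogy as a low-rank delta on top of a frozen base layer.
import torch
import torch.nn as nn

class PedagogyAdapter(nn.Module):
    """Low-rank update applied on top of a frozen linear layer."""
    def __init__(self, base_linear: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():
            p.requires_grad = False                     # subject knowledge frozen
        self.down = nn.Linear(base_linear.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base_linear.out_features, bias=False)
        nn.init.zeros_(self.up.weight)                  # starts as an identity wrap

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.up(self.down(x))     # pedagogy = small delta

# Training then touches only adapter parameters, e.g.:
# opt = torch.optim.AdamW(
#     (p for p in model.parameters() if p.requires_grad), lr=1e-4)
# The open question is whether the same small delta can be re-attached to a
# different base model with compatible layer shapes.
```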
PAATRA
Small language models waste most of their parameters on oversized vocabularies inherited from larger siblings. We investigate right-sizing vocabulary to free capacity for reasoning on bounded-complexity tasks.
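As a rough illustration of what right-sizing can buy, here is a minimal sketch assuming a tokenizer whose inherited vocabulary far exceeds what a bounded domain actually uses; the function, sizes, and savings below are hypothetical, not PAATRA's method.

```python
# Minimal sketch of vocabulary right-sizing: keep only the token ids a domain
# corpus actually uses, remap them, and slice the embedding table accordingly.
from collections import Counter
import torch
import torch.nn as nn

def shrink_vocab(token_ids: list[list[int]], old_embedding: nn.Embedding,
                 keep: int = 8000) -> tuple[dict[int, int], nn.Embedding]:
    """Keep the `keep` most frequent token ids seen in the corpus; return an
    old-to-new id remap and a smaller table initialised from the old rows."""
    counts = Counter(t for seq in token_ids for t in seq)
    kept = [tok for tok, _ in counts.most_common(keep)]
    remap = {old: new for new, old in enumerate(kept)}
    new_emb = nn.Embedding(len(kept), old_embedding.embedding_dim)
    with torch.no_grad():
        new_emb.weight.copy_(old_embedding.weight[torch.tensor(kept)])
    return remap, new_emb

# The remap is then applied to the corpus before training. Back-of-envelope:
# a 128,000-entry table at d_model = 2048 spends ~262M parameters on the
# embedding alone; keeping 8,000 entries drops that to ~16M, capacity a small
# model can spend on its transformer layers instead.
```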