BodhiLekhan
Why does your OCR forget who's writing?
The Problem
A doctor writes the same prescription shorthand every day. A student fills out answer sheets in the same loopy cursive all semester. A government clerk processes forms in the same tight print, page after page. The handwriting doesn't change. But the OCR system reading it starts from zero every single time.
State-of-the-art handwriting recognition treats every page as if it came from a stranger. Models like TrOCR and PARSeq achieve strong average accuracy across large datasets, but averages hide the damage. Writers whose style drifts from the training distribution — and that's most people — get significantly worse results. The model doesn't learn from seeing them before. It can't.
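To see how an average can look healthy while individual writers suffer, here's a toy illustration. The per-writer character error rates (CER) below are invented for demonstration, not measured results:

```python
import numpy as np

# Invented per-writer CER values for a hypothetical generic model.
# Most writers sit near the training distribution; a few drift far from it.
cer_by_writer = {
    "w01": 0.03, "w02": 0.04, "w03": 0.05, "w04": 0.04, "w05": 0.06,
    "w06": 0.19, "w07": 0.23,  # out-of-distribution handwriting styles
}

rates = np.array(list(cer_by_writer.values()))
print(f"mean CER:         {rates.mean():.3f}")  # ~0.091 -- looks acceptable
print(f"worst-writer CER: {rates.max():.3f}")   # 0.230 -- 2.5x the mean
```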
Recent work has tried to fix this. MetaHTR (CVPR 2021) adapts to individual writers using meta-learning, but requires labeled samples and gradient computation at every inference. MetaWriter (CVPR 2025) improves accuracy further, but still recomputes adaptation from scratch each time the model sees a page. Both methods pay the adaptation cost on every single inference, even when the writer hasn't changed since yesterday.
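For readers who want that cost structure spelled out, here is a minimal runnable sketch of what MAML-style test-time adaptation pays per page. It is schematic, not MetaHTR's or MetaWriter's actual code; `TinyHTR` and its interfaces are stand-ins we invented for illustration:

```python
import copy

import torch
import torch.nn as nn

class TinyHTR(nn.Module):
    """Stand-in for a real HTR model; illustrative only."""
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(64, 80)  # 80-character toy vocabulary

    def forward(self, feats):
        return self.head(feats)

def transcribe_with_tta(model, support, query, steps=3, lr=1e-3):
    """Schematic test-time adaptation: every page pays for `steps`
    gradient passes over writer-specific support samples, and the
    adapted weights are discarded as soon as the page is read."""
    adapted = copy.deepcopy(model)                     # fresh copy per inference
    opt = torch.optim.SGD(adapted.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):                             # recomputed on EVERY page
        for feats, target in support:                  # labeled samples, as MetaHTR needs
            opt.zero_grad()
            loss_fn(adapted(feats), target).backward()
            opt.step()
    with torch.no_grad():
        return adapted(query).argmax(-1)               # adaptation is then thrown away

# Toy run: one writer's labeled support lines, one query line.
model = TinyHTR()
support = [(torch.randn(8, 64), torch.randint(0, 80, (8,)))]
print(transcribe_with_tta(model, support, torch.randn(1, 64)))
```

The point is the loop structure: the gradient work inside `transcribe_with_tta` runs again for the same writer tomorrow, and the day after that.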
This is especially wasteful in institutional settings. A school has a known set of students. A hospital has a known set of doctors. These are stable cohorts. Paying the cost of writer adaptation at every inference, for writers you've already seen, is an engineering failure dressed up as a research limitation.
What We're Exploring
We think writer adaptation for known cohorts should be a one-time cost, not a recurring tax.
Current approaches tie personalization to inference. Every time the system reads a page, it re-learns the writer. We're investigating whether writer-specific recognition can be made persistent and reusable — learn a writer once, recognize their handwriting from that point on with no additional cost per page. The same recognition system serving a cohort of hundreds without slowing down for any of them.
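A minimal sketch of what "persistent and reusable" could look like mechanically: a cache of per-writer style vectors, filled once at enrollment and looked up at inference. Everything here (`WriterBank`, the style encoder, the conditioning interface) is a hypothetical shape for the idea, not a committed design:

```python
import torch

class WriterBank:
    """Per-writer conditioning vectors: enrollment runs once per writer;
    every later page is a dictionary lookup plus one forward pass."""

    def __init__(self, style_encoder, recognizer):
        self.style_encoder = style_encoder  # page -> style vector
        self.recognizer = recognizer        # (page, style) -> transcription
        self.embeddings: dict[str, torch.Tensor] = {}

    @torch.no_grad()
    def enroll(self, writer_id: str, sample_pages):
        # One-time cost: distill the writer's style into a fixed vector.
        vecs = torch.stack([self.style_encoder(p) for p in sample_pages])
        self.embeddings[writer_id] = vecs.mean(dim=0)

    @torch.no_grad()
    def transcribe(self, writer_id: str, page):
        # Recurring cost: lookup + forward pass. No gradients, ever.
        return self.recognizer(page, self.embeddings[writer_id])

# Toy stand-ins; real encoder and recognizer models would slot in here.
bank = WriterBank(style_encoder=lambda p: p.mean(dim=0),
                  recognizer=lambda page, style: page @ style)
```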
The picture we're working toward: a school enrolls its students once. From that point on, handwritten answer sheets are recognized at the accuracy of a personalized model, at the speed of a generic one. A new student joins mid-year? One enrollment session. No retraining, no infrastructure changes.
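Continuing the hypothetical `WriterBank` sketch above, the mid-year student is one more `enroll` call; nothing about serving changes:

```python
# One enrollment session for a new writer: a few sample pages, one call.
bank.enroll("student_new", [torch.randn(16, 64) for _ in range(3)])

# From now on, every page from this writer is a plain forward pass.
print(bank.transcribe("student_new", torch.randn(16, 64)).shape)  # torch.Size([16])
```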
Getting there raises questions we find genuinely open, and the cleanest way to answer them is head-to-head measurement.
We're testing against established baselines on the IAM Handwriting Database, the standard benchmark for writer-level HTR research, with direct comparisons against both generic models and test-time adaptation methods. Same writers, same test sets, same metrics. The only variable is whether the system remembers who it's reading.
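A sketch of that comparison harness, under the assumption that each condition exposes the same `transcribe(writer_id, page)` interface. The metric (CER via Levenshtein distance) and the per-writer breakdown are the substance; the names and data layout are placeholders:

```python
from collections import defaultdict

def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def cer(pred: str, truth: str) -> float:
    """Character error rate: edit distance over reference length."""
    return levenshtein(pred, truth) / max(len(truth), 1)

def evaluate(systems, test_set):
    """systems: {name: transcribe_fn}; test_set: [(writer_id, page, truth)].
    Same writers, same pages, same metric for every condition."""
    for name, transcribe in systems.items():
        per_writer = defaultdict(list)
        for writer_id, page, truth in test_set:
            per_writer[writer_id].append(cer(transcribe(writer_id, page), truth))
        means = [sum(v) / len(v) for v in per_writer.values()]
        print(f"{name:>12}: mean CER {sum(means) / len(means):.3f}, "
              f"worst writer {max(means):.3f}")
```

Reporting worst-writer CER alongside the mean is deliberate: it is exactly the number that average-accuracy leaderboards hide.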