
BodhiLekhan

Why does your OCR forget who's writing?

The Problem

A doctor writes the same prescription shorthand every day. A student fills out answer sheets in the same loopy cursive all semester. A government clerk processes forms in the same tight print, page after page. The handwriting doesn't change. But the OCR system reading it starts from zero every single time.

State-of-the-art handwriting recognition treats every page as if it came from a stranger. Models like TrOCR and PARSeq achieve strong average accuracy across large datasets, but averages hide the damage. Writers whose style drifts from the training distribution — and that's most people — get significantly worse results. The model doesn't learn from seeing them before. It can't.

Recent work has tried to fix this. MetaHTR (CVPR 2021) adapts to individual writers using meta-learning, but requires labeled samples and gradient computation at every inference. MetaWriter (CVPR 2025) improves accuracy further, but still recomputes adaptation from scratch each time the model sees a page. Both methods pay the adaptation cost on every single inference, even when the writer hasn't changed since yesterday.
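To make that recurring cost concrete, here is a minimal sketch of the test-time adaptation pattern this family of methods relies on. It is not MetaHTR or MetaWriter code; the model, loss function, and hyperparameters are placeholder assumptions, and the only point is that the gradient-step inner loop runs again for every page read.

    # Illustrative sketch of per-inference adaptation; not the actual
    # MetaHTR/MetaWriter implementations. Model, loss, and data names
    # are hypothetical stand-ins for the "adapt, then recognize" pattern.
    import copy
    import torch

    def recognize_with_test_time_adaptation(model, support_pages, support_labels,
                                             query_page, loss_fn, lr=1e-3, steps=3):
        """Adapt a copy of the generic model to one writer, then read one page.

        The adaptation (gradient steps on labeled support samples) repeats on
        every inference call; that is the recurring tax described above.
        """
        adapted = copy.deepcopy(model)              # fresh copy: nothing persists
        optimizer = torch.optim.SGD(adapted.parameters(), lr=lr)

        for _ in range(steps):                      # inner-loop gradient updates
            optimizer.zero_grad()
            loss = loss_fn(adapted(support_pages), support_labels)
            loss.backward()
            optimizer.step()

        with torch.no_grad():                       # finally, read the query page
            return adapted(query_page)

    # Every call pays a model copy plus `steps` backward passes, even for a
    # writer the system transcribed yesterday; the adapted weights are discarded.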

This is especially wasteful in institutional settings. A school has a known set of students. A hospital has a known set of doctors. These are stable cohorts. Paying the cost of writer adaptation at every inference, for writers you've already seen, is an engineering failure dressed up as a research limitation.

What We're Exploring

We think writer adaptation for known cohorts should be a one-time cost, not a recurring tax.

Current approaches tie personalization to inference. Every time the system reads a page, it re-learns the writer. We're investigating whether writer-specific recognition can be made persistent and reusable — learn a writer once, recognize their handwriting from that point on with no additional cost per page. The same recognition system serving a cohort of hundreds without slowing down for any of them.

The picture we're working toward: a school enrolls its students once. From that point on, handwritten answer sheets are recognized at the accuracy of a personalized model, at the speed of a generic one. A new student joins mid-year? One enrollment session. No retraining, no infrastructure changes.
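As a design sketch, that enrollment picture might look something like the following. The WriterProfileStore class, the idea that a profile is a single cached tensor, and the conditioning interface are assumptions made for illustration, not a committed architecture.

    # A minimal sketch of the enroll-once / recognize-many pattern we're
    # exploring. All names and the profile representation are hypothetical.
    import torch

    class WriterProfileStore:
        """Persistent map from writer ID to a precomputed writer profile."""

        def __init__(self):
            self._profiles: dict[str, torch.Tensor] = {}

        def enroll(self, writer_id: str, encoder, enrollment_pages: torch.Tensor):
            """One-time cost: distill a writer's samples into a stored profile."""
            with torch.no_grad():
                # e.g. average a style embedding over the enrollment pages
                profile = encoder(enrollment_pages).mean(dim=0)
            self._profiles[writer_id] = profile

        def recognize(self, writer_id: str, recognizer, page: torch.Tensor) -> str:
            """Per-page cost: one forward pass conditioned on the cached profile."""
            profile = self._profiles[writer_id]      # lookup, no gradient steps
            with torch.no_grad():
                return recognizer(page, profile)     # generic-model speed

    # Usage: enroll each student once, then every answer sheet is a plain
    # forward pass.
    # store = WriterProfileStore()
    # store.enroll("student_042", style_encoder, samples_042)
    # text = store.recognize("student_042", htr_model, scanned_sheet)

The design choice being probed is simple: push all writer-specific computation into enrollment so that recognition stays a single forward pass, regardless of cohort size.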

Getting there raises questions we find genuinely open:

  • Persistence versus drift. Handwriting changes — people age, get injured, switch pens. If you fix a writer's profile at enrollment, when does it go stale? How do you detect that and correct for it without starting over?
  • Cohort scaling. Personalizing for 10 writers is easy. Personalizing for 10,000 is a different problem. How do you manage a large population of writer-specific models without the storage and serving costs becoming impractical?
  • Evaluation beyond averages. Standard HTR benchmarks report corpus-level CER and WER. But the whole point of personalization is per-writer improvement. How do you build an evaluation that measures whether the worst writers got better, not just whether the average moved?
We're testing against established baselines on the IAM Handwriting Database, the standard benchmark for writer-level HTR research, with a direct comparison against both generic models and test-time adaptation methods. Same writers, same test sets, same metrics. The variable is whether the system remembers who it's reading.
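The evaluation question above is concrete enough to sketch. Below is a minimal, hypothetical per-writer scoring pass: it computes CER per writer and reports the worst decile next to the mean, which is the comparison that shows whether the hardest writers actually improved. The data layout and function names are illustrative, not our actual harness.

    # Per-writer evaluation sketch, as opposed to a single corpus-level average.
    # Data layout and function names are hypothetical.
    from collections import defaultdict
    from statistics import mean

    def edit_distance(a: str, b: str) -> int:
        """Standard Levenshtein distance via dynamic programming."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                 # deletion
                                curr[j - 1] + 1,             # insertion
                                prev[j - 1] + (ca != cb)))   # substitution
            prev = curr
        return prev[-1]

    def per_writer_cer(samples):
        """samples: iterable of (writer_id, prediction, ground_truth)."""
        errors, chars = defaultdict(int), defaultdict(int)
        for writer_id, pred, truth in samples:
            errors[writer_id] += edit_distance(pred, truth)
            chars[writer_id] += len(truth)
        return {w: errors[w] / max(chars[w], 1) for w in errors}

    def report(samples):
        rates = sorted(per_writer_cer(samples).values())
        worst_decile = rates[int(0.9 * len(rates)):]   # the writers hurt most
        print(f"mean per-writer CER: {mean(rates):.3f}")
        print(f"worst-decile CER:    {mean(worst_decile):.3f}")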

Status

Active Research