17 Aug 2004
CS 5244: Multimedia
Robust Document Understanding
- OCR and document understanding are (currently) fragile technologies
  - Full scan ⇒ OCR ⇒ store pipeline makes many assumptions
  - What are some?
    - ________________
    - ________________
    - ________________
    - ________________
    - ________________
Scholarly and historical digital libraries (DLs) are much harder!