1
Scanned Documents. LBSC 796/INFM 718R. Douglas W. Oard. Week 8, October 29, 2007
4
Expanding the Search Space [figure labels: Scanned Docs; Identity: Harriet; “… Later, I learned that John had not heard …”]
5
High Payoff Investments [figure labels: Searchable Fraction; Transducer Capabilities; OCR; MT; Handwriting; Speech]
6
The Big Picture: Find the words. Index the words. Do ranked retrieval. Use that system to find what you want.
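The four steps above can be sketched end to end. This is a minimal illustration, not the system used in the course: it assumes the "find the words" step (OCR) has already produced plain text, and the page texts, the function names `build_index` and `search`, and the TF-IDF weighting are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits as terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Index the words: map each term to {page_id: term frequency}."""
    index = defaultdict(dict)
    for page_id, text in pages.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][page_id] = tf
    return index


def search(index, num_pages, query):
    """Ranked retrieval: order pages by the summed TF-IDF weight of query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term never recognized on any page
        idf = math.log(1 + num_pages / len(postings))
        for page_id, tf in postings.items():
            scores[page_id] += (1 + math.log(tf)) * idf
    # Highest score first; ties are broken by page id for a stable order.
    return sorted(scores, key=lambda p: (-scores[p], p))


# Hypothetical OCR output standing in for the "find the words" step.
ocr_pages = {
    "page1": "Later, I learned that John had not heard the news",
    "page2": "The committee heard testimony on tobacco advertising",
    "page3": "John wrote to the committee about tobacco and tobacco taxes",
}

index = build_index(ocr_pages)
ranking = search(index, len(ocr_pages), "john tobacco")
print(ranking)
```

With these sample pages, `page3` ranks first because it is the only page that matches both query terms.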
7
Some Issues: Language-based search without language! – Shape codes. Accuracy-selection effect of ranked retrieval – Poor recognition scatters in the query-term space. Blind relevance feedback – Based on clean text. Image-domain summaries.
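The shape-code idea can be illustrated with a toy example. The sketch below is an assumption-laden simplification: the class assignment (ascender, descender, x-height, plus separate codes for "i" and "j") is a hypothetical mapping chosen here, and the codes are derived from known text, whereas a real system would estimate them from the page image without recognizing the characters.

```python
from collections import defaultdict

# Hypothetical shape classes; real shape-coding schemes differ in detail.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gpqy")


def shape_code(word):
    """Map a word to a string of coarse character-shape classes."""
    codes = []
    for ch in word:
        if ch.isupper() or ch.isdigit() or ch in ASCENDERS:
            codes.append("A")  # rises above the x-height
        elif ch in DESCENDERS:
            codes.append("g")  # drops below the baseline
        elif ch in ("i", "j"):
            codes.append(ch)   # dotted letters get their own codes
        elif ch.isalpha():
            codes.append("x")  # fits within the x-height
        # Punctuation and whitespace are skipped.
    return "".join(codes)


def build_shape_index(words):
    """Group words by shape code, so a query code retrieves every candidate."""
    by_code = defaultdict(list)
    for word in words:
        by_code[shape_code(word)].append(word)
    return by_code


vocabulary = ["heard", "board", "learned", "John"]
by_code = build_shape_index(vocabulary)
print(by_code[shape_code("heard")])
```

Here "heard" and "board" share a code, which shows the trade-off: matching needs no character recognition, but distinct words can collide.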
8
Some Applications: Case management for litigation. Duplicate detection for declassification productivity and anti-tiling. Knowledge management from everything I have ever xeroxed or faxed.
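For the duplicate-detection application, one common approach (not necessarily the one the lecture had in mind) is to compare sets of overlapping character n-grams, which tolerates a few OCR errors. In this sketch the function names, the shingle length of 4, the 0.5 threshold, and the sample texts are all assumptions.

```python
def shingles(text, n=4):
    """Return the set of overlapping character n-grams of the normalized text."""
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a, text_b, threshold=0.5, n=4):
    """Flag two OCR'd texts as near-duplicates when shingle overlap is high."""
    return jaccard(shingles(text_a, n), shingles(text_b, n)) >= threshold


# Hypothetical samples: a clean scan, a copy with one OCR error, an unrelated page.
clean = "Later, I learned that John had not heard the news."
noisy = "Later, l learned that John had not heard the news."
other = "The committee heard testimony on tobacco advertising."

print(is_near_duplicate(clean, noisy))
print(is_near_duplicate(clean, other))
```

A single misrecognized character changes only the few shingles that contain it, so the noisy copy still overlaps heavily with the clean one, while the unrelated page shares only a handful of n-grams.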
9
Some Applications: Legacy Tobacco Documents Library – http://legacy.library.ucsf.edu/. Google Books – http://books.google.com/. George Washington’s Papers – http://ciir.cs.umass.edu/irdemo/hw-demo/