1 Using the Gamera framework for the recognition of cultural heritage materials
Levy Project II
Digital Knowledge Center, Sheridan Libraries
Michael Droettboom, Ichiro Fujinaga, Karl MacMillan, G. Sayeed Choudhury, Tim DiLauro, Mark Patton, Teal Anderson

2 Overview
- Application area: cultural heritage materials
- The Gamera system
- Demo

3 Ingestion of cultural heritage materials
- Digitizing images is a good first step...
- ...however, new kinds of inquiry and retrieval require converting the images into a more useful format

4 The problem
- Commercial OCR systems are often inadequate for non-standard documents
- The market for specialized recognition of historical documents is very small
- Researchers performing document recognition often “re-invent” the basic image processing wheel

5 The solution
- Provide easy-to-use tools that allow domain experts (people with specialized knowledge of a collection) to create custom recognition applications

6 Gamera
- Allows rapid development of domain-specific document recognition applications
- Domain experts can customize and control all aspects of the recognition process
- Includes an easy-to-use interactive environment for experimentation

7 How Gamera is like Matlab
- Very high-level programming language
- Close to the problem at hand
- Supports incremental (test-and-refine) approach to software development
- User can see the results of the program interactively
- Provides a large set of pre-designed building blocks

8 How Gamera is unlike Matlab
- Designed specifically for structured documents
- Narrower scope makes it simpler to use
- Open-source and open standards-compliant
- Gamera scripts are production applications
- Can be batched, to process a number of images consecutively

9 Tools provided by Gamera
- Pre-processing
- Document segmentation and analysis
- Symbol segmentation and analysis
- Syntactical or structural analysis
- Output

10 Pre-processing
- Photoshop-like filtering of the image
- Noise removal
- Blurring
- De-skewing
- Binarization
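
To make two of the pre-processing steps above concrete, here is a minimal sketch of binarization followed by a very simple form of noise removal. It is plain Python on nested lists, not Gamera's own API; the function names `binarize` and `remove_specks`, the fixed threshold of 128, and the tiny sample image are all assumptions made for illustration.

```python
def binarize(gray, threshold=128):
    """Return a 0/1 image: 1 (ink) where a pixel is darker than the threshold."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def remove_specks(binary):
    """Clear ink pixels that have no ink among their 8 neighbours."""
    height, width = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for y in range(height):
        for x in range(width):
            if not binary[y][x]:
                continue
            has_neighbour = any(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbour:
                cleaned[y][x] = 0
    return cleaned


# Hypothetical 5x5 greyscale image: a 2x2 dark blob plus one stray dark pixel.
gray = [
    [250, 250, 250, 250, 250],
    [250,  20,  30, 250, 250],
    [250,  25,  15, 250, 250],
    [250, 250, 250, 250, 250],
    [250, 250, 250, 250,  10],
]
binary = binarize(gray)
cleaned = remove_specks(binary)
```

A fixed global threshold is the simplest possible choice; degraded historical pages would likely need an adaptive method instead.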

11 Document segmentation and analysis
- Dividing the document into high-level units

12 Symbol segmentation and classification
- Dividing the image into individual symbols
- Using a classifier to label them
- Classifiers can include k-nearest neighbor, neural nets, hidden Markov models, etc.
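
As an illustration of the k-nearest neighbor option mentioned above, the sketch below labels a symbol by majority vote among its closest training examples. It is not Gamera's classifier; the function `knn_classify`, the two-element feature vectors (read them as, say, aspect ratio and ink density), and the labels are hypothetical.

```python
from collections import Counter
import math


def knn_classify(features, training, k=3):
    """Label `features` by majority vote among its k nearest training examples."""
    nearest = sorted(training, key=lambda ex: math.dist(features, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training set: (feature vector, label) pairs.
training = [
    ((0.30, 0.90), "stem"),
    ((0.35, 0.85), "stem"),
    ((0.25, 0.95), "stem"),
    ((1.00, 0.70), "notehead"),
    ((1.10, 0.75), "notehead"),
    ((0.95, 0.65), "notehead"),
]

label_a = knn_classify((1.05, 0.72), training)
label_b = knn_classify((0.28, 0.90), training)
```

In this scheme a domain expert's work is in supplying labelled examples, which matches the test-and-refine workflow the slides describe.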

13 Symbol segmentation and classification
- Original image

14 Symbol segmentation and classification
- Image with staff lines removed

15 Symbol segmentation and classification
- Connected component segmentation
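
Connected component segmentation can be sketched as a breadth-first flood fill over a binary image, producing one bounding box per blob of touching ink pixels. This is a generic illustration rather than Gamera's implementation; the function name `connected_components`, the choice of 8-connectivity, and the sample `page` are assumptions.

```python
from collections import deque


def connected_components(binary):
    """Return one (top, left, bottom, right) box per 8-connected ink blob."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            # Start a new component and grow it with a breadth-first search.
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny in range(max(0, cy - 1), min(height, cy + 2)):
                    for nx in range(max(0, cx - 1), min(width, cx + 2)):
                        if binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Hypothetical binary page with two separate symbols.
page = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 1],
]
boxes = connected_components(page)
```

Each box can then be cropped out and handed to a classifier as an individual symbol.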

16 Symbol segmentation and classification
- Labeling by a classifier

17 Syntactical or structural analysis
- Grouping the symbols into meaningful units and performing basic analysis on their relationships
- In text: lines, paragraphs, columns, etc.
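
For text, one simple way to group symbols into lines is to merge bounding boxes whose vertical extents overlap, then order each line left to right. The sketch below assumes boxes in (top, left, bottom, right) form; the function `group_into_lines` and the sample coordinates are hypothetical and not taken from Gamera.

```python
def group_into_lines(boxes):
    """Group (top, left, bottom, right) boxes into text lines, in reading order."""
    lines = []  # each entry: [line_top, line_bottom, member_boxes]
    for box in sorted(boxes):  # tuples sort by top coordinate first
        top, _, bottom, _ = box
        if lines and top <= lines[-1][1]:
            # The box starts above the current line's bottom edge: same line.
            lines[-1][1] = max(lines[-1][1], bottom)
            lines[-1][2].append(box)
        else:
            lines.append([top, bottom, [box]])
    # Within each line, order the symbols from left to right.
    return [sorted(members, key=lambda b: b[1]) for _, _, members in lines]


# Hypothetical symbol boxes from two lines of text, in arbitrary order.
symbols = [
    (12, 40, 30, 55),
    (10, 5, 28, 20),
    (50, 8, 70, 25),
    (11, 22, 29, 36),
    (52, 30, 68, 44),
]
text_lines = group_into_lines(symbols)
```

This greedy rule would merge two lines if a single tall symbol spanned both, so a real application would need a more careful criterion.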

18 Demonstration
- A whirlwind tour of the Gamera system

