Published by Clyde Paul. Modified over 8 years ago.
1
IAEA International Atomic Energy Agency
OCR at INIS
Database Production & Imaging Group
Yves Reynaud
Y.Reynaud-Pulido @ iaea.org
INIS Training Seminar, 14-16 November 2011, Vienna, Austria
2
Some OCR features
We can find the needle in the haystack.
OCR enables basic search within an unstructured document.
OCR brings your digitized collection to life.
OCR adds extra value to your images.
3
OCR is computer software that:
Translates images of handwritten or typewritten text into machine-editable text.
Translates pictures of characters into a standard encoding scheme that represents them (e.g. ASCII or Unicode).
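To make the "standard encoding scheme" concrete, here is a minimal Python sketch of what text looks like once it is stored as character codes rather than pixels. The string `recognized` is a hypothetical recognizer output, not the result of any particular OCR engine.

```python
# Text produced by OCR is stored as character codes, not as pixels.
recognized = "OCR"  # hypothetical recognizer output (assumption)

# Each character maps to one Unicode code point; ASCII is the 0-127 subset.
code_points = [ord(ch) for ch in recognized]

# Plain Latin text fits in ASCII, one byte per character.
ascii_bytes = recognized.encode("ascii")

# Non-Latin text needs a Unicode encoding such as UTF-8.
utf8_bytes = "滤器".encode("utf-8")

print(code_points)      # [79, 67, 82]
print(len(ascii_bytes)) # 3
print(len(utf8_bytes))  # 6 (three bytes per character here)
```

Once text is in this form it can be searched, copied, and indexed, which is not possible with the scanned image alone.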
4
Scanned image (paper or micrographic)
Vector image (created from a native application), shown here as a raster image for the sake of comparison
5
“Do not see the trees (letters); try to see the forest (sentences)”
F0R 488UR1N6 7H3 L0N63V17Y 0F 1NF0RM4710N, P3RH4P8 7H3 M087 1MP0R74N7 R0L3 1N 7H3 0P3R4710N 0F 4 D16174L 4RCH1V3 18 M4N461N6 7H3 1D3N717Y, 1N736R17Y 4ND QU4L17Y 0F 7H3 4RCH1V38 1783LF 48 4 7RU873D 80URC3 0F 7H3 CUL7UR4L R3C0RD.
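A human reader undoes the digit-for-letter substitutions in this sample without effort. A program can do the same only if it is handed the mapping explicitly. The sketch below uses a lookup table inferred from this sample text alone; the names `DIGIT_TO_LETTER` and `decode` are illustrative, and the table is an assumption, not part of any OCR product.

```python
# Digit-to-letter table inferred from the sample text on this slide
# (assumption): 0=O, 1=I, 3=E, 4=A, 6=G, 7=T, 8=S.
DIGIT_TO_LETTER = str.maketrans("0134678", "OIEAGTS")

def decode(text: str) -> str:
    """Replace every listed digit with its look-alike letter."""
    return text.translate(DIGIT_TO_LETTER)

sample = "7H3 0P3R4710N 0F 4 D16174L 4RCH1V3"
print(decode(sample))  # THE OPERATION OF A DIGITAL ARCHIVE
```

A blind table like this would also rewrite genuine digits (dates, page numbers), which is why it only works when the context is known in advance.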
6
Verdana
FOR ASSURING THE LONGEVITY OF INFORMATION, PERHAPS THE MOST IMPORTANT ROLE IN THE OPERATION OF A DIGITAL ARCHIVE IS MANAGING THE IDENTITY, INTEGRITY AND QUALITY OF THE ARCHIVES ITSELF AS A TRUSTED SOURCE OF THE CULTURAL RECORD.
7
Brush Script MT (Windows font)
FOR ASSURING THE LONGEVITY OF INFORMATION, PERHAPS THE MOST IMPORTANT ROLE IN THE OPERATION OF A DIGITAL ARCHIVE IS MANAGING THE IDENTITY, INTEGRITY AND QUALITY OF THE ARCHIVES ITSELF AS A TRUSTED SOURCE OF THE CULTURAL RECORD.
8
PCs ≠ Humans
OCR compares patterns and selects the closest match. It can be constrained to a specific context, but this requires customization.
People adapt to circumstances and can read past misspellings if the context is clear.
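The idea of comparing patterns and selecting the closest match can be sketched with a toy template matcher. This is a simplified illustration, not how any production OCR engine is implemented: the 3x5 glyph shapes in `TEMPLATES` are hypothetical, and the distance used is a plain pixel-by-pixel mismatch count (Hamming distance).

```python
# Toy 3x5 glyph templates; '#' is ink, '.' is background (hypothetical shapes).
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}

def hamming(a, b):
    """Count the pixels that differ between two equally sized glyphs."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def classify(glyph):
    """Return the template character with the fewest differing pixels."""
    return min(TEMPLATES, key=lambda ch: hamming(glyph, TEMPLATES[ch]))

# A "T" with one pixel dropped, as might happen in a poor scan.
noisy_t = ["###", ".#.", ".#.", "...", ".#."]

print(classify(noisy_t))  # T
```

The matcher always returns some character, even for a glyph that resembles none of the templates. It has no notion of whether the answer makes sense in the sentence, which is the gap between the machine and the human reader described above.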
9
True or false?
Usually, an image is adequately sampled if each letter is at least two pixels thick:
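The two-pixel rule of thumb can be related to scanning resolution with simple arithmetic: thickness in pixels equals physical thickness in inches multiplied by the resolution in dpi. The sketch below assumes a stroke width of 0.2 mm purely for illustration; the function names and the sample resolutions are likewise assumptions, not figures from the presentation.

```python
MM_PER_INCH = 25.4

def stroke_pixels(stroke_mm: float, dpi: float) -> float:
    """Thickness in pixels of a stroke of the given width at the given dpi."""
    return stroke_mm / MM_PER_INCH * dpi

def min_dpi(stroke_mm: float, min_pixels: float = 2.0) -> float:
    """Lowest resolution at which the stroke spans min_pixels pixels."""
    return min_pixels * MM_PER_INCH / stroke_mm

STROKE_MM = 0.2  # hypothetical stroke width of small printed text

for dpi in (150, 200, 300):
    px = stroke_pixels(STROKE_MM, dpi)
    print(f"{dpi} dpi: {px:.2f} px, at least two pixels: {px >= 2}")

print(f"minimum resolution: about {min_dpi(STROKE_MM):.0f} dpi")
```

Under this assumed stroke width, 150 and 200 dpi fall short of two pixels while 300 dpi clears it.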
10
Zoom in
11
Zoom in
12
Results from OCR
It is in this context that I…
… and an additional protocol on the basis…
13
Chinese in pixels
14
Chinese vector images from OCR
滤器
15
Arabic in pixels
16
Arabic vector images from OCR
هذ ا وشملت
17
InftyReader - an OCR System for Math Documents
18
Thank you