Published by Cameron Lang. Modified over 9 years ago.
1
Making an Electronic Text: An exercise in preservation and applied technology
2
Charles Hindley’s Curiosities of Street Literature: Published in 1871, with only 456 copies printed. This book is a collection of broadsides, ballads, and popular stories from Dickensian London.
3
What we are doing: Using high-quality scanned images and OCR software, we have created text documents from the scans. Using XML, we are then able to “mark up” the documents for display on the web. We are following a defined standard for electronic texts: the TEI, or Text Encoding Initiative.
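As a rough illustration of the workflow above, the sketch below holds a tiny TEI-style document in a Python string and checks that it is well-formed XML. The element names (`TEI`, `teiHeader`, `text`, `body`, `div`, `head`, `lg`, `l`) follow common TEI usage, but the slides do not show the project's actual encoding, so the structure and all placeholder text are assumptions. The TEI namespace declaration is left out to keep the example short.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal TEI-style document. The verse lines are placeholders,
# not text taken from the book.
SAMPLE = """<TEI>
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Curiosities of Street Literature</title></titleStmt>
      <publicationStmt><p>Placeholder publication statement.</p></publicationStmt>
      <sourceDesc><p>Transcribed from scanned page images.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="ballad">
        <head>Placeholder ballad title</head>
        <lg>
          <l>First placeholder line of verse</l>
          <l>Second placeholder line of verse</l>
        </lg>
      </div>
    </body>
  </text>
</TEI>"""

# fromstring raises ParseError if the markup is not well-formed.
root = ET.fromstring(SAMPLE)

# Read the title from the header and collect every verse line from the body.
title = root.findtext("teiHeader/fileDesc/titleStmt/title")
verse_lines = [line.text for line in root.iter("l")]

print(title)
print(len(verse_lines), "verse lines")
```

The header carries the description of the source, while the body carries the transcribed text, so the same file can be displayed or processed by different tools.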
4
Text Encoding Initiative: This standard was defined by the University of Oxford, Brown University, the University of Bergen, and the University of Virginia. The TEI Consortium formulated its guidelines to facilitate interchange between individuals and groups using different programs and computer systems over a broad range of applications.
5
To make TEI-defined documents as accessible as possible, a cross-platform mark-up language was chosen. A mark-up language can be as simple as HTML (HyperText Markup Language), as complex as LaTeX, or as user-definable as XML (eXtensible Markup Language).
6
XML (Why it’s good for you): eXtensible Mark-up Language was chosen by the TEI for its cross-platform, multi-application capabilities. The user defines the mark-up in XML, creating custom tags and searching XML documents based on those tags.
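A minimal sketch of what "custom tags you can search on" might look like in practice, using Python's standard `xml.etree.ElementTree` module. The tag names (`collection`, `ballad`, `broadside`) and the titles are invented for illustration and are not the tags used in this project.

```python
import xml.etree.ElementTree as ET

# Hypothetical custom tags; the names and contents are invented for illustration.
DOC = """<collection>
  <ballad><title>Placeholder ballad A</title></ballad>
  <broadside><title>Placeholder broadside</title></broadside>
  <ballad><title>Placeholder ballad B</title></ballad>
</collection>"""

root = ET.fromstring(DOC)


def titles_of(tag):
    """Return the <title> text of every element with the given tag name."""
    return [element.findtext("title") for element in root.iter(tag)]


ballad_titles = titles_of("ballad")
print(ballad_titles)
```

Because the tags describe what each item is rather than how it looks, the same search works no matter how the document is later displayed.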
7
The Images: Each scanned image is saved as a 40-megabyte uncompressed TIFF. Using OCR (optical character recognition) software, we are able to preserve the text.
8
The Text: Once the image has been OCR’ed, a text document is created. These text documents can then be marked up in XML. Markup can be done in software or manually.
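To give a sense of what marking up "in software" could involve, here is a small sketch that wraps plain OCR output in XML paragraph elements. It assumes paragraphs are separated by blank lines and uses generic `body` and `p` tags; the function name `text_to_xml` and the sample text are hypothetical, and the project's real tooling is not described in the slides.

```python
import xml.etree.ElementTree as ET


def text_to_xml(plain_text):
    """Wrap blank-line-separated paragraphs of OCR text in <p> elements."""
    body = ET.Element("body")
    for block in plain_text.split("\n\n"):
        # Collapse the line breaks and extra spaces left over from the page layout.
        paragraph = " ".join(block.split())
        if paragraph:
            ET.SubElement(body, "p").text = paragraph
    # ElementTree escapes characters such as & and < when serializing.
    return ET.tostring(body, encoding="unicode")


# Invented sample standing in for the text file produced by the OCR step.
ocr_output = "First paragraph,\nsplit over two lines.\n\nSecond paragraph with an & sign."
marked_up = text_to_xml(ocr_output)
print(marked_up)
```

Automatic markup like this only captures structure that can be inferred from layout. Finer distinctions, such as telling a ballad from a broadside, would still need manual tagging or review.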