Making an Electronic Text: An exercise in preservation and applied technology


Charles Hindley’s Curiosities of Street Literature
Published in 1871; only 456 copies were printed. This book is a collection of broadsides, ballads, and popular stories from Dickensian London.

What we are doing
Using high-quality scanned images and OCR software, we have created text documents from the scanned images. Using XML, we are then able to “mark up” the documents for display on the web. We are following a defined standard for electronic texts: the TEI, or Text Encoding Initiative.

Text Encoding Initiative
This standard was defined by the University of Oxford, Brown University, the University of Bergen, and the University of Virginia. The TEI Consortium formulated its guidelines to facilitate interchange between individuals and groups using different programs and computer systems over a broad range of applications.

To make the TEI-defined documents as accessible as possible, a cross-platform mark-up language was chosen. A mark-up language can be as simple as HTML (Hyper Text Mark-up Language), as complex as LaTeX, or as user-definable as XML (eXtensible Mark-up Language).

XML: Why it’s good for you
eXtensible Mark-up Language. Chosen by TEI for its cross-platform, multi-application capabilities. The user defines the mark-up in XML: you can create custom tags and search XML documents based on those tags.
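
As a minimal sketch of searching by custom tags, the Python below parses a small XML fragment with the standard-library xml.etree.ElementTree module and collects the text of every title. The tag names (collection, ballad, title, line) and the sample text are hypothetical illustrations, not the project's actual TEI mark-up.

```python
import xml.etree.ElementTree as ET

# Hypothetical mark-up: these tag names are illustrative,
# not the project's real TEI tags.
SAMPLE = """
<collection>
  <ballad>
    <title>First Example Ballad</title>
    <line>An invented first line.</line>
  </ballad>
  <ballad>
    <title>Second Example Ballad</title>
    <line>Another invented line.</line>
  </ballad>
</collection>
"""

def find_text_by_tag(xml_source, tag):
    """Return the text of every element named `tag`, in document order."""
    root = ET.fromstring(xml_source.strip())
    return [element.text for element in root.iter(tag)]

titles = find_text_by_tag(SAMPLE, "title")
print(titles)
```

The same function can look up any other tag the encoder has defined, for example find_text_by_tag(SAMPLE, "line").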

The Images
Each scanned image is saved as a 40-megabyte uncompressed TIFF. Using OCR (optical character recognition) software, we are able to preserve the text.
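
To see why an uncompressed scan can reach tens of megabytes, the sketch below multiplies the pixel count by the bytes stored per pixel. The page size, resolution, and colour depth are assumptions chosen for illustration; the slides do not state the actual scanning settings.

```python
def uncompressed_image_bytes(width_in, height_in, dpi, bytes_per_pixel):
    """Estimate raw pixel-data size; ignores the small TIFF header and tags."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel

# Assumed settings (not from the source):
# 8.5 x 11 inch page, 400 dpi, 24-bit colour (3 bytes per pixel).
size_bytes = uncompressed_image_bytes(8.5, 11, 400, 3)
size_mb = size_bytes / 1_000_000
print(f"{size_bytes} bytes, about {size_mb:.1f} MB")
```

Under these assumed settings the estimate comes to about 45 MB, the same order of magnitude as the 40 megabytes quoted above.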

The Text
Once the image has been OCR’ed, a text document is created. These text documents can then be marked up in XML. Markup can be done in software or manually.
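
As a rough sketch of marking up in software, the Python below wraps OCR output in a simplified, TEI-like XML skeleton using only the standard library. It assumes that paragraphs in the OCR text are separated by blank lines; the function name and the sample text are hypothetical. The result is not a complete or validated TEI document (it has no header, for instance).

```python
import xml.etree.ElementTree as ET

def text_to_xml(title, ocr_text):
    """Wrap plain OCR text in a simplified TEI-like structure (not validated)."""
    root = ET.Element("text")
    body = ET.SubElement(root, "body")
    ET.SubElement(body, "head").text = title
    # Assumption: paragraphs in the OCR output are separated by blank lines.
    for block in ocr_text.split("\n\n"):
        paragraph = " ".join(block.split())  # collapse line breaks left by OCR
        if paragraph:
            ET.SubElement(body, "p").text = paragraph
    return ET.tostring(root, encoding="unicode")

# Invented sample standing in for OCR output.
sample = "First line of a page\nwrapped by the scanner.\n\nA second paragraph & more."
result = text_to_xml("Sample Page", sample)
print(result)
```

Building the elements through the library, instead of pasting tags into strings, means characters such as & are escaped automatically, so the output stays well-formed.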