Demystifying Digital Scholarship 10: TEI

Slides:



Advertisements
Similar presentations
LIS650lecture 1 XHTML 1.0 strict Thomas Krichel
Advertisements

DOCUMENT TYPES. Digital Documents Converting documents to an electronic format will preserve those documents, but how would such a process be organized?
XML: Extensible Markup Language
WeB application development
Website Design.
1 XSLT – eXtensible Stylesheet Language Transformations Modified Slides from Dr. Sagiv.
SPECIAL TOPIC XML. Introducing XML XML (eXtensible Markup Language) ◦A language used to create structured documents XML vs HTML ◦XML is designed to transport.
 To publish information for global distribution, one needs a universally understood language, a kind of publishing mother tongue that all computers may.
XML Unit 6 October 31. XML, review XML is used to markup data Used to describe information Uses tags like HTML –But all tags are user-defined –Must be.
INF201 Fall2010 Intro. to Info. Technologies Department of Informatics University at Albany – SUNY Original Source: w3schools.com Prepared by Xiao Liang,
XSL Unit 6 November 2. XSL –eXtensible Stylesheet Language –Basically a stylesheet for XML documents XSL has three parts: –XSLT –XPath –XSL-FO.
Advanced Technical Writing 2006 Session #3. Today in Class… ► Teams pitch poster concepts:  Meet with your editorial team, show us how your material.
XML Introduction What is XML –XML is the eXtensible Markup Language –Became a W3C Recommendation in 1998 –Tag-based syntax, like HTML –You get to make.
By Carrie Moran. To examine the Metadata Object Description Schema (MODS) metadata scheme to determine its utility based on structure, interoperability.
Pemrograman Berbasis WEB XML part 2 -Aurelio Rahmadian- Sumber: w3cschools.com.
August Chapter 1 - Introduction Learning XML by Erik T. Ray Slides were developed by Jack Davis College of Information Science and Technology Radford.
Using Styles and Style Sheets for Design
OCLC Online Computer Library Center CONTENTdm ® Digital Collection Management Software Ron Gardner, OCLC Digital Services Consultant ICOLC Meeting April.
CREATED BY ChanoknanChinnanon PanissaraUsanachote
IS432 Semi-Structured Data Lecture 5: XSLT Dr. Gamal Al-Shorbagy.
Introduction technology XSL. 04/11/2005 Script of the presentation Introduction the XSL The XSL standard Tools for edition of codes XSL Necessary resources.
XP 1 CREATING AN XML DOCUMENT. XP 2 INTRODUCING XML XML stands for Extensible Markup Language. A markup language specifies the structure and content of.
Introduction to XML Eugenia Fernandez IUPUI. What is XML? From the World Wide Web Consortium (W3C) The Extensible Markup Language (XML) is the universal.
An Introduction to XML Presented by Scott Nemec at the UniForum Chicago meeting on 7/25/2006.
XML Technologies Surekha Akula
Implementing Forms and Form Renderers in the Open Source Portfolio David McPherson, Chris Maurer Will Trillich, Janice Smith Materials by Sean Keesler.
Intro. to XML & XML DB Bun Yue Professor, CS/CIS UHCL.
By: Dan Johnson & Jena Block. RDF definition What is Semantic web? Search Engine Example What is RDF? Triples Vocabularies RDF/XML Why RDF?
ECA 228 Internet/Intranet Design I XSLT Example. ECA 228 Internet/Intranet Design I 2 CSS Limitations cannot modify content cannot insert additional text.
CITA 330 Section 6 XSLT. Transforming XML Documents to XHTML Documents XSLT is an XML dialect which is declared under namespace "
XSLT Kanda Runapongsa Dept. of Computer Engineering Khon Kaen University.
Declaratively Producing Data Mash-ups Sudarshan Murthy 1, David Maier 2 1 Applied Research, Wipro Technologies 2 Department of Computer Science, Portland.
COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 4 1COMP9321, 15s2, Week.
Web Technologies Lecture 4 XML and XHTML. XML Extensible Markup Language Set of rules for encoding a document in a format readable – By humans, and –
SCHOOL OF LIBRARY, ARCHIVE AND INFORMATION STUDIES Andy Dawson LIS1510 Library and Archives Automation Issues XML and extensible systems Andy Dawson School.
Tutorial 1 Developing a Basic Web Page. Objectives Learn the history of the Web and HTML Describe HTML standards and specifications Understand HTML elements.
XML CSC1310 Fall HTML (TIM BERNERS-LEE) HyperText Markup Language  HTML (HyperText Markup Language): December  Markup  Markup is a symbol.
XP Tutorial 9New Perspectives on HTML and XHTML, Comprehensive 1 Working with XHTML Creating a Well-Formed Valid Document Tutorial 9.
 XML derives its strength from a variety of supporting technologies.  Structure and data types: When using XML to exchange data among clients, partners,
XML Extensible Markup Language
XML Introduction to XML Extensible Markup Language.
XML Schema – XSLT Week 8 Web site:
1 XSLT XSLT (extensible stylesheet language – transforms ) is another language to process XML documents. Originally intended as a presentation language:
Extensible Markup Language (XML) Pat Morin COMP 2405.
How HTML responsiveness translates to PDF
HTML Structure & syntax
HTML Structure & syntax
Java Web Services Orca Knowledge Center – Web Service key concepts.
Advanced HTML Tags:.
XSLT: The XML transformation language
XML Introduction Bill Jerome.
This is the cover slide..
Introduction to OBIEE:
Do More with Digital Scholarship: XML-Based Metadata
Representation and Markup
Do More with Digital Scholarship: XML-Based Metadata
Data Virtualization Tutorial: XSLT and Streaming Transformations
XML QUESTIONS AND ANSWERS
Towards new approaches to editing of old manuscripts and documents
XML in Web Technologies
Database Processing with XML
Basics of Accessibility in Adobe PDF
GDSS – Digital Signature
Unit A.
Internet Technologies I - Lect.01 - Waleed Ibrahim Osman
Tutorial 7 – Integrating Access With the Web and With Other Programs
CSE591: Data Mining by H. Liu
HTML Structure & syntax
HTML5 and CSS3 Illustrated Unit B: Getting Started with HTML
Unit 6 - XML Transformations
Presentation transcript:

Demystifying Digital Scholarship 10: TEI March 9th, 2017 Matthew Evan Davis Postdoctoral Fellow, Sherman Centre for Digital Scholarship davism17@mcmaster.ca @medievalmatt

What is TEI? TEI stands for the Text Encoding Initiative. It defines itself as “a consortium which collectively develops and maintains a standard for the representation of texts in digital form. Its chief deliverable is a set of Guidelines which specify encoding methods for machine-readable texts, chiefly in the humanities, social sciences and linguistics.”

What is TEI Really? TEI is a way to encode both information about a text – metadata – and the content of that text in a way that allows people to do various things with the text despite having only encoded it once.

What is TEI Really? TEI is a way to encode both information about a text – metadata – and the content of that text in a way that allows people to do various things with the text despite having only encoded it once. So what? Why not just put your transcription up as a word file, or pdf, or something and be done with it?

So what? .pdf files, text files, and html, once created, can generally only be used in the original contexts they were created for.

So what? A TEI-encoded text, on the other hand, can be converted to any of the formats mentioned, can be utilized in whole or in part by other projects, and can be searched on across multiple documents easily.

How does TEI do this? TEI is able to do all these things simultaneously because, rather than being tied to a particular piece of software it is actually a XML standard. XML, much like HTML, is built on the concepts of start tags, end tags, and empty elements: <p>The quick brown fox jumped over the lazy dog</p> <p/> The TEI standard can be found at http://www.tei- c.org/Guidelines/P5/.

Terminology The example given in the last slide consists entirely of elements: <p>The quick brown fox jumped over the lazy dog</p> <p/> If you want to add additional information to an XML element that is meant to be machine readable, but not readable by humans, you can include what is called an attribute: <p rend=“align(center) case(allcaps)”>The quick brown fox jumped over the lazy dog</p> The description of each element in the TEI guidelines not only tells you what the intended purpose of the element is, but what attributes can be attached to that element. For the attributes that can attach to the element <p>, browse to http://www.tei-c.org/release/doc/tei- p5-doc/en/html/ref-p.html.

Terminology

How do I actually build a document? This is the smallest TEI document possible: <TEI version="5.0" xmlns="http://www.tei-c.org/ns/1.0"> <teiHeader> <fileDesc> <titleStmt> <title>The shortest TEI Document Imaginable</title> </titleStmt> <publicationStmt> <p>First published as part of TEI P2, this is the P5 version using a name space.</p> </publicationStmt> <sourceDesc> <p>No source: this is an original work.</p> </sourceDesc> </fileDesc> </teiHeader> <text> <body> <p>This is about the shortest TEI document imaginable.</p> </body> </text> </TEI> A TEI document consists of two parts – the metadata about the document, which is contained in a tree under the <teiHeader>, and the actual document itself, which is contained either under a <text> or a <sourceDoc> element..

How do I actually build a document? Everything else you do is a process of checking the guidelines, determining what element matches with what you are trying to transcribe, and applying it accordingly. The underlying structure of the TEI Header can be found at http://www.tei- c.org/Vault/P5/3.1.0/doc/tei-p5-doc/en/html/HD.html, while the underlying structure for a default text can be found at http://www.tei- c.org/Vault/P5/3.1.0/doc/tei-p5-doc/en/html/DS.html. Note that the default text structure does not handle every situation that is likely to come up! The Table of Contents for the full Guidelines (at http://www.tei-c.org/Vault/P5/3.1.0/doc/tei-p5-doc/en/html/) can be useful for ferreting out the various ways that your document fits (and doesn’t fit!) TEI as it is written.

Editorial Tools Anything that can edit text can be used to create an XML document. XML-specific tools simply allow you to validate the XML for correctness against the schema, run various programmatic transformations against the document, or autocomplete certain elements. Possible examples (a fuller list comparing features can be found on Wikipedia: https://en.wikipedia.org/wiki/Comparison_of_XML_editors): OxyGen (https://www.oxygenxml.com/): academic license (US$99) Xeditor (http://www.xeditor.com/portal): web-based, pricing by enquiry. Adobe Framemaker (http://www.adobe.com/products/framemaker.html): commerical license (US$999)

Online Tools TAPAS (http://tapasproject.org/): designed to “Visualize, Store, and Share Your TEI.” CWRC-Writer (http://www.cwrc.ca/projects/infrastructure- projects/technical-projects/cwrc-writer/). Based on a modified version of TEI Lite (http://www.tei- c.org/Guidelines/Customization/Lite/) http://apps.testing.cwrc.ca/editor/dev/index.htm Username: cwrc Password: cwrcy

Displaying, Searching and Working With a TEI Text Since TEI is really just a standard under XML, there are two tools that will let you manipulate the document you’ve just created: XSLT and XQuery. XSLT stands for eXtensible Stylesheet Langauge Transformations. It is a language designed to transform XML documents into other formats – other versions of XML, Word Documents, plain text, etc. XQuery is a programming language that can both query and transform collections of data, including TEI documents.

Matthew Evan Davis davism17@mcmaster.ca @medievalmatt Thank you! Matthew Evan Davis davism17@mcmaster.ca @medievalmatt