Implementing Concurrent Markup in XML Patrick Durusau Society of Biblical Literature Matthew Brook O’Donnell

Slides:



Advertisements
Similar presentations
XML-XSL Introduction SHIJU RAJAN SHIJU RAJAN Outline Brief Overview Brief Overview What is XML? What is XML? Well Formed XML Well Formed XML Tag Name.
Advertisements

XML: Extensible Markup Language
XSL XSLT and XPath 11-Apr-17.
Stand-off Annotation Further details and examples: Durusau and O’Donnell’s (2001) powerpoint presentation Thompson and McKelvie’s (1997) “Hyperlink semantics.
1 Indexing and Querying XML Data for Regular Path Expressions A Paper by Quanzhong Li and Bongki Moon Presented by Amnon Shochot.
XML Introduction What is XML –XML is the eXtensible Markup Language –Became a W3C Recommendation in 1998 –Tag-based syntax, like HTML –You get to make.
XML(EXtensible Markup Language). XML XML stands for EXtensible Markup Language. XML is a markup language much like HTML. XML was designed to describe.
XML Primer. 2 History: SGML vs. HTML vs. XML SGML (1960) XML(1996) HTML(1990) XHTML(2000)
September 15, 2003Houssam Haitof1 XSL Transformation Houssam Haitof.
Introduction to XML Rashmi Kukanur. XML XML stands for Extensible Markup Language XML was designed to carry data XML and HTML designed with different.
Introduction to XML This material is based heavily on the tutorial by the same name at
Manohar – Why XML is Required Problem: We want to save the data and retrieve it further or to transfer over the network. This.
4/20/2017.
Chapter 12 Creating and Using XML Documents HTML5 AND CSS Seventh Edition.
XP New Perspectives on XML Tutorial 4 1 XML Schema Tutorial – Carey ISBN Working with Namespaces and Schemas.
10/06/041 XSLT: crash course or Programming Language Design Principle XSLT-intro.ppt 10, Jun, 2004.
Pemrograman Berbasis WEB XML part 2 -Aurelio Rahmadian- Sumber: w3cschools.com.
RDF (Resource Description Framework) Why?. XML XML is a metalanguage that allows users to define markup XML separates content and structure from formatting.
EAD: A Technical Introduction Julie Hardesty, Metadata Analyst June 3, 2014.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 XML Taken from Chapter 7.
XML Anisha K J Jerrin Thomas. Outline  Introduction  Structure of an XML Page  Well-formed & Valid XML Documents  DTD – Elements, Attributes, Entities.
Why XML ? Problems with HTML HTML design - HTML is intended for presentation of information as Web pages. - HTML contains a fixed set of markup tags. This.
CREATED BY ChanoknanChinnanon PanissaraUsanachote
1Computer Sciences Department Princess Nourah bint Abdulrahman University.
XP New Perspectives on XML Tutorial 6 1 TUTORIAL 6 XSLT Tutorial – Carey ISBN
XP 1 CREATING AN XML DOCUMENT. XP 2 INTRODUCING XML XML stands for Extensible Markup Language. A markup language specifies the structure and content of.
Extensible Markup and Beyond
CISC 3140 (CIS 20.2) Design & Implementation of Software Application II Instructor : M. Meyer Address: Course Page:
1 © Netskills Quality Internet Training, University of Newcastle Introducing XML © Netskills, Quality Internet Training University.
What is XML?  XML stands for EXtensible Markup Language  XML is a markup language much like HTML  XML was designed to carry data, not to display data.
Sekimo Solutions mentioned by the TEI  CONCUR: an optional feature of SGML (not XML) that allows multiple.
Chapter 27 The World Wide Web and XML. Copyright © 2004 Pearson Addison-Wesley. All rights reserved.27-2 Topics in this Chapter The Web and the Internet.
XML About XML Things to be known Related Technologies XML DOC Structure Exploring XML.
XML TUTORIAL Portions from w3 schools By Dr. John Abraham.
Avoid using attributes? Some of the problems using attributes: Attributes cannot contain multiple values (child elements can) Attributes are not easily.
1 XSLT An Introduction. 2 XSLT XSLT (extensible Stylesheet Language:Transformations) is a language primarily designed for transforming the structure of.
ECA 228 Internet/Intranet Design I XSLT Example. ECA 228 Internet/Intranet Design I 2 CSS Limitations cannot modify content cannot insert additional text.
Lecture 11 XSL Transformations (part 1: Introduction)
E0262 – MIS – Multimedia Storage Techniques XML (Extensible Markup Language  XML is a markup language for creating documents containing structured information.
XML 2nd EDITION Tutorial 1 Creating An Xml Document.
1 XML An Overview Roger Debreceny University of Hawai`i Skip White University of Delaware XBRL Workshop, August 2006.
XML Documents Chao-Hsien Chu, Ph.D. School of Information Sciences and Technology The Pennsylvania State University Elements Attributes Comments PI Document.
WEB APPLICATION DEVELOPMENT For More visit:
Introduction to XML This presentation covers introductory features of XML. What XML is and what it is not? What does it do? Put different related technologies.
Chapter 27 The World Wide Web and XML. Copyright © 2004 Pearson Addison-Wesley. All rights reserved.27-2 Topics in this Chapter The Web and the Internet.
XML Introduction. What is XML? XML stands for eXtensible Markup Language XML stands for eXtensible Markup Language XML is a markup language much like.
XML Design Goals 1.XML must be easily usable over the Internet 2.XML must support a wide variety of applications 3.XML must be compatible with SGML 4.It.
XML Introduction. Markup Language A markup language must specify What markup is allowed What markup is required How markup is to be distinguished from.
Jennifer Widom XML Data Introduction, Well-formed XML.
XP New Perspectives on XML, 2 nd Edition Tutorial 8 1 TUTORIAL 8 CREATING ELEMENT GROUPS.
Submitted To: Ms. Poonam Saini, Asst. Prof., NITTTR Submitted By: Rohit Handa ME (Modular) CSE 2011 Batch.
Problems with XML & XML Schemas XML falls apart on the Scalability design goal. 1.The order in which elements appear in an XML document is significant.
SDPL 2002Notes 4: Intro to Style Sheets1 4. Introduction to Style Sheets n Discussed recently: –Programmatic manipulation of documents n Now a more human-oriented.
Representing data with XML SE-2030 Dr. Mark L. Hornick 1.
Introduction to XML XML – Extensible Markup Language.
University of Nottingham School of Computer Science & Information Technology Introduction to XML 2. XSLT Tim Brailsford.
XML CSC1310 Fall HTML (TIM BERNERS-LEE) HyperText Markup Language  HTML (HyperText Markup Language): December  Markup  Markup is a symbol.
Basic HTML Document Structure. Slide 2 Goals (XHTML HTML5) XHTML Separate document structure and content from document formatting HTML 5 Create a formal.
Lecture 23 XQuery 1.0 and XPath 2.0 Data Model. 2 Example 31.7 – User-Defined Function Function to return staff at a given branch. DEFINE FUNCTION staffAtBranch($bNo)
 XML derives its strength from a variety of supporting technologies.  Structure and data types: When using XML to exchange data among clients, partners,
SEMI-STRUCTURED DATA (XML) 1. SEMI-STRUCTURED DATA ER, Relational, ODL data models are all based on schema Structure of data is rigid and known is advance.
XML Introduction to XML Extensible Markup Language.
XML Notes taken from w3schools. What is XML? XML stands for EXtensible Markup Language. XML was designed to store and transport data. XML was designed.
Rendering XML Documents ©NIITeXtensible Markup Language/Lesson 5/Slide 1 of 46 Objectives In this session, you will learn to: * Define rendering * Identify.
What is a tree really? Patrick Durusau Society of Biblical Literature TEI 2003 Nancy, France.
CH 15 XSL Transformations 1. Objective What is XSL? Overview of XSL transformations Understanding XSL templates Computing the value of a node with xsl:value-of.
Extensible Markup Language (XML) Pat Morin COMP 2405.
XML QUESTIONS AND ANSWERS
XML Data Introduction, Well-formed XML.
New Perspectives on XML
Presentation transcript:

Implementing Concurrent Markup in XML Patrick Durusau Society of Biblical Literature Matthew Brook O’Donnell OpenText.org and University of Surrey Roehampton

Why Concurrent Hierarchies? Structures that do not “properly” nest in the XML sense Complex textual traditions with multiple witnesses and variants Different Interpretations of Text Recording physical layout of text and other analysis

Overlapping Example Matthew 3:8 Bear fruit that befits repentance, Matthew 3:9 and to not presume to say to yourselves, ‘We have Abraham as our father’; for I tell you, God is able from these stones to raise up children of Abraham.

Matthew 3:8-9 –First Choice Bear fruit that befits repentance, and to not presume to say to yourselves, ‘We have Abraham as our father’; for I tell you, God is able from these stones to raise up children of Abraham.

Matthew 3:8-9 – Second Choice Bear fruit that befits repentance, and to not presume to say to yourselves, ‘We have Abraham as our father’; for I tell you, God is able from these stones to raise up children of Abraham.

Matthew 3:8-9 – Verboten! Bear fruit that befits repentance, and to not presume to say to yourselves, ‘We have Abraham as our father’; for I tell you, God is able from these stones to raise up children of Abraham.

Design Principles: Part 1 Formal simplicity Capacity to represent all occurring or imaginable kinds of structures Suitability for formal or mechanical validation Clear identity with the notations needed for simpler cases Allow for conditional indexing and processing

Design Principles: Part 2 Allow for extraction of well-formed subtrees and documents Allow for query of the position of the element between two or more hierarchies Use standard XML syntax and mechanisms Validation and processing must be possible with standard XML software Can be used with existing documents encoded in XML markup

Bottom Up Virtual Hierarchies Membership of PCDATA in a particular hierarchy Record that information using XPath syntax Gather information from multiple document instances into a base file Query membership in and across hierarchies with BUVH

A Simple Example (1) This is text a a texs A 1 in a b base file b an C 2 Four separate (overlapping) hierarchies

This is text 1 in a base file 2 1. Page view pages lines This is text in a base file A Simple Example (2)

This is text 1 in a base file 2 2. Text view A Simple Example (3) paragraph This is text in a base file

This is text in a base file Linguistic view A Simple Example (4) CLAUSE complement predicate subjectadjunct This is text in a base file

This is text in a base file Textual variant view A Simple Example (5) texs in MSS A an in MSS C <rdg xlink:href="base.xml#id(w3)" wit="A" val="texs"/> <rdg xlink:href="base.xml#id(w5)" wit="C" val="an"/> (using out-of-line markup)

A Simple Example (6) To encode these hierarchies in a single file, one view must be selected as base hierarchy This is text in a base file This is text in a base file Non-unique IDs The other hierarchies must be inserted into this base hierarchy in a way that avoids overlapping elements This is text in a base file Inconsistent nesting Loss of parent- child relationship

A Simple Example (7) BUVH Approach 1. Create common base file with divisions for Atomic PCDATA This is text in a base file (here word divisions)

A Simple Example (7) BUVH Approach 1. Create common base file with divisions for Atomic PCDATA (here word divisions) 2. For each Atomic PCDATA element: a. Locate in each hierarchy b. Construct XML Membership XPath Expression describing its position within the hierarchy c. Add Tree Structure Position Attribute for element’s position in hierarchy to element in base file

A Simple Example (7) BUVH Approach This is text in a base file 1. Page hierarchy: /pages This is text in a base file This is text in a base file This is text in a base file This is text in a base file This is... This is text in a base file 2. Text hierarchy: 3. Linguistic hierarchy: /clauses This is text in a base file This is text in a base file This is text in a base file This is text in a base file This is text in a base file This is text in a base file This is text in a base file This is text in a base file <w id= " w1 ” This <w id= " w1 " This <w id= " w1 " This a. Locate in hierarchyb. Construct XPath expressionc. Add TSP Attributea. Locate in hierarchyb. Construct XPath expressionc. Add TSP Attributea. Locate in hierarchyb. Construct XPath expressionc. Add TSP Attribute

A Simple Example (8) <baseFile xmlns:sn="urn:clause" xmlns:tx="urn:text" xmlns:pg="urn:pages" xmlns:vr="urn:variants"> <wid="w1" > This <w id="w2" > is <w id="w3" > text

A Simple Example (8) <w id="w4" > in <w id="w5" > a <w id="w6" > base <w id="w7" > file

A Simple Example (9) Queries across different hierarchies can be carried out using XPath expressions, e.g. using XSLT Example 1: Locate words that have textual variants and are found on page 2 XPath query: Every element in base file that takes part in ‘variants’ hierarchy (i.e. It has a textual variant) And has a pg:pages attribute that contains the string ‘p2’, i.e. an id for the second page Result: <w id="w5" > a

A Simple Example (9) Queries across different hierarchies can be carried out using XPath expressions, e.g. using XSLT Example 2: Locate words in the first clause that do not occur on the first line of their page XPath query: Every element in base file that is a child of a element that is the first child of the element And has a pg:pages attribute that does not contain the string ‘line[1]’, i.e. not the first child Result: <w id="w3" > text <w id="w7" > file

Summary: BUVH Approach Authoring of XML occurs within a single hierarchy (any XML editor) Automatic construction of base file with any XSLT processor Query with any XSLT processor Unlimited hierarchies

Future Plans Development of XSLT Extensions to process BUVH Base File Base file format (possible use of Xalan’s DTM format?) Testing of BUVH against more complex examples Use of XLink with BUVH for read-only or large corpora

Markup is metadata about #PCDATA Markup is metadata about #PCDATA</partingThought>