What is a tree really? Patrick Durusau Society of Biblical Literature TEI 2003 Nancy, France.


Descriptive versus Procedural Markup
Separation of concerns
  – How text is processed, from
  – How text is described
Allows decisions about processing to be deferred
Added advantage of portability between processing systems
Describes the structure of texts

Separation sounds Great!
Great Divide Begins! (or does it?)
  – GML/SGML adopts angle bang syntax for descriptive markup
  – Encodes the structures in texts
  – But not how to process or present them
On the other hand:
  – Instead of traditional presentation
  – We now have markup trees

Are Markup Trees Presentation?
Bear fruit that befits repentance, and do not presume to say to yourselves, ‘We have Abraham as our father’; for I tell you, God is able from these stones to raise up children of Abraham.

Trees as Presentation
Bear fruit that befits repentance, and do not presume to say to yourselves, ‘We have Abraham as our father’; for I tell you, God is able from these stones to raise up children of Abraham.

Which Tree to Follow?
Traditional XML says either:
  – text/verse, or
  – text/sentence
But both cannot be present
Why? It is predetermined that all markup in a file must be recognized as markup and presented as a well-formed tree
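The restriction can be checked with any conforming XML parser. The sketch below is illustrative only: the element names `verse` and `sentence` follow the slide, while the placeholder words and the helper `is_well_formed` are assumptions added here, not part of the talk. It uses Python's standard `xml.etree.ElementTree` to show that nested markup parses but overlapping markup is rejected.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the sentence nests cleanly inside a verse.
NESTED = "<text><verse>one <sentence>two</sentence></verse></text>"

# Hypothetical sample: the sentence starts in one verse and ends in the next,
# so the verse and sentence hierarchies overlap.
OVERLAPPING = (
    "<text><verse>one <sentence>two</verse>"
    "<verse>three</sentence> four</verse></text>"
)


def is_well_formed(doc):
    """Return True if a standard XML parser accepts doc as a single tree."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(NESTED))
print(is_well_formed(OVERLAPPING))
```

The parser has no way to keep both hierarchies: as soon as it sees `</verse>` while `sentence` is still open, it reports a mismatched tag and stops.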

Choosing A Tree
Recognize all markup
  – Odd requirement, given the history of parsing files that are not SGML/XML with selective recognition of markup
  – Can even selectively recognize SGML/XML markup so long as it is already well formed
  – Why limit markup options with the recognize-all option?
  – Simplicity of parsing!

Simplicity of Parsing
Simplicity harmful to markup!
  – Well-formedness is contrary to:
      Known features of texts
      Needs of scholars
  – Well-formedness may make sense for documents without DTDs or Schemas
  – But what scholarly encoded document will exist without a DTD or Schema?
  – Markup limited by ease of parsing?

Simplicity of Parsing II
Validating SAX-based parsers
  – Recognize the GI (generic identifier) anyway
  – Order of processing is the problem
  – The parser fires on any “<”
  – Only to then discover it is not in the DTD or schema
  – What if the ordering were reversed?
  – That is: build the tree to recognize, then parse for markup that matches?

Simplicity of Parsing III
But what of the other “markup?”
Can you say “string?”
If markup recognition is conditional:
  – Can impose unlimited layers of markup inline on a text
  – Can search for structures in any tree, and match against strings that are markup in another tree
  – Divorces markup from a particular presentation
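Conditional recognition can be sketched in a few lines. This is a toy illustration under stated assumptions, not the parser the talk calls for: the function `select_markup`, the regular expression, and the sample text are all hypothetical, and attributes, comments, and entities are ignored. The set of names to recognize is fixed first; only matching tags are kept as markup, and every other tag is either dropped or left in the text as an ordinary string.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Matches simple start and end tags without attributes (an assumption of this sketch).
TAG = re.compile(r"</?([A-Za-z][\w.-]*)>")

# Hypothetical source with two overlapping hierarchies in one file.
SOURCE = (
    "<text><verse>one <sentence>two </verse>"
    "<verse>three</sentence> four</verse></text>"
)


def select_markup(source, recognized, keep_other=False):
    """Keep only tags named in `recognized` as markup.

    Other tags are dropped, or kept as escaped character data
    when keep_other is True.
    """
    out = []
    pos = 0
    for match in TAG.finditer(source):
        out.append(escape(source[pos:match.start()]))
        if match.group(1) in recognized:
            out.append(match.group(0))          # recognized: stays markup
        elif keep_other:
            out.append(escape(match.group(0)))  # unrecognized: just a string
        pos = match.end()
    out.append(escape(source[pos:]))
    return "".join(out)


# Two well-formed trees built from the same overlapping source.
verse_view = ET.fromstring(select_markup(SOURCE, {"text", "verse"}))
sentence_view = ET.fromstring(select_markup(SOURCE, {"text", "sentence"}))

print([v.text for v in verse_view.iter("verse")])
print(sentence_view.find("sentence").text)
```

With `keep_other=True` the unrecognized tags survive as text inside the chosen tree, so a search in the verse tree can still match the string `<sentence>` that is markup in the other tree.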

Is Selective Recognition Possible?
XPath/XQuery
  – Efficient Filtering of XML Documents with XPath Expressions, Chee-Yong Chan, Pascal Felber, Minos Garofalakis, Rajeev Rastogi
  – YFilter: Efficient and Scalable Filtering of XML Documents, Yanlei Diao, Peter Fischer, Michael J. Franklin, Raymond To
  – Efficient Filtering of XML Documents for Selective Dissemination of Information, Mehmet Altınel, Michael J. Franklin

Is Selective Recognition Likely?
SC34/WG1 Document Schema Definition Languages (DSDL; includes RELAX NG)
Part 1: Overview of ISO/IEC
  – Path-based addressing (role of relationships that are not hierarchical)
  – JITTs (Just-In-Time-Trees) have been suggested as one approach to consider

Simplistic Markup or Simplistic Parsing
The choice is fairly simple:
  – Simplistic markup, or
  – Simplistic parsing
The latter may have been appropriate when Sun workstations had 128K RAM and 100 MHz processors
Laptops now routinely have 1 GB RAM and processors faster than 1 GHz

Workarounds or a Solution?
All of the current options for overlapping markup compensate for simplistic parsing
Parsing research has advanced, but markup parsing has remained static
Workarounds are not solutions!
Our texts need a solution
Our users deserve a solution

What Can TEI Do?
Develop compelling use cases for overlapping markup
Demonstrate the advantages of non-simplistic parsing for markup (sigh, yes, the commercial side of things)
Press our needs in forums such as SC34 WG3

Conclusion
Simplistic parsing will continue so long as no one makes the case for better parsing of markup
The “someone” to make the case is the academic markup community
Why? We should not dumb down our texts for the convenience of avoiding further development of markup parsers!