XML Robert Grimm New York University. The Whirlwind So Far  HTTP  Persistent connections  (Style sheets)  Fast servers  Event driven architectures.

Slides:



Advertisements
Similar presentations
XML-XSL Introduction SHIJU RAJAN SHIJU RAJAN Outline Brief Overview Brief Overview What is XML? What is XML? Well Formed XML Well Formed XML Tag Name.
Advertisements

What is XML? a meta language that allows you to create and format your own document markups a method for putting structured data into a text file; these.
XML: text format Dr Andy Evans. Text-based data formats As data space has become cheaper, people have moved away from binary data formats. Text easier.
An Introduction to XML Based on the W3C XML Recommendations.
XML 6.3 DTD 6. XML and DTDs A DTD (Document Type Definition) describes the structure of one or more XML documents. Specifically, a DTD describes:  Elements.
Extensible Markup Language Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University.
A Technical Introduction to XML Transparency No. 1 XML quick References.
Java API for XML Processing (JAXP) CSE 4/586: Distributed Systems Department of Computer Science and Engineering University at Buffalo, New York Jia Zhao.
Thayer School of Engineering Dartmouth Lecture 2 Overview Web Services concept XML introduction Visual Studio.net.
Creating a Well-Formed Valid Document. 2 Objectives Introducing XHTML Creating a Well-Formed Document Creating a Valid Document Creating an XHTML Document.
26-Jun-15 XML. 2 HTML and XML, I XML stands for eXtensible Markup Language HTML is used to mark up text so it can be displayed to users XML is used to.
XML Introduction What is XML –XML is the eXtensible Markup Language –Became a W3C Recommendation in 1998 –Tag-based syntax, like HTML –You get to make.
XML(EXtensible Markup Language). XML XML stands for EXtensible Markup Language. XML is a markup language much like HTML. XML was designed to describe.
Jackson, Web Technologies: A Computer Science Perspective, © 2007 Prentice-Hall, Inc. All rights reserved Chapter 7 Representing Web Data:
COS 381 Day 14. Agenda Questions?? Resources Source Code Available for examples in Text Book in Blackboard
Cornell CS 502 Markup Languages SGML, HTML, XML, XHTML CS 502 – Carl Lagoze – Cornell University.
Document Type Definitions. XML and DTDs A DTD (Document Type Definition) describes the structure of one or more XML documents. Specifically, a DTD describes:
Fundamentals of Web DevelopmentRandy Connolly and Ricardo HoarFundamentals of Web DevelopmentRandy Connolly and Ricardo Hoar Fundamentals of Web DevelopmentRandy.
Topics The "bigger picture" –The "XML sales pitch" –XML/XHTML vs. SGML/HTML –XML in electronic publishing –XML and the future, web 2.0 XML basics: –Building.
17 Apr 2002 XML Schema Andy Clark. What is it? A grammar definition language – Like DTDs but better Uses XML syntax – Defined by W3C Primary features.
XP New Perspectives on XML Tutorial 4 1 XML Schema Tutorial – Carey ISBN Working with Namespaces and Schemas.
XP Tutorial 9New Perspectives on Creating Web Pages with HTML, XHTML, and XML 1 Working with XHTML Creating a Well-Formed Valid Document Tutorial 9.
XML Anisha K J Jerrin Thomas. Outline  Introduction  Structure of an XML Page  Well-formed & Valid XML Documents  DTD – Elements, Attributes, Entities.
Scientific Markup Languages Birds of a Feather A 10-Minute Introduction to XML Timothy W. Cole Mathematics Librarian & Professor of.
Representing Web Data: XML CSI 3140 WWW Structures, Techniques and Standards.
XML What is XML? XML v.s. HTML XML Components Well-formed and Valid Document Type Definition (DTD) Extensible Style Language (XSL) SAX and DOM.
XML 1 Enterprise Applications CE00465-M XML. 2 Enterprise Applications CE00465-M XML Overview Extensible Mark-up Language (XML) is a meta-language that.
XML Extensible Markup Language. What is XML? ● meta-markup language ● a language for defining a family of languages ● semantic/structured mark-up language.
SAX Parsing Presented by Clifford Lemoine CSC 436 Compiler Design.
XML Syntax - Writing XML and Designing DTD's
Representing Web Data: XML CSI 3140 WWW Structures, Techniques and Standards.
What is XML?  XML stands for EXtensible Markup Language  XML is a markup language much like HTML  XML was designed to carry data, not to display data.
Advanced Java Session 9 New York University School of Continuing and Professional Studies.
Session IV Chapter 9 – XML Schemas
Processing of structured documents Spring 2002, Part 2 Helena Ahonen-Myka.
Lecture 22 XML querying. 2 Example 31.5 – XQuery FLWOR Expressions ‘=’ operator is a general comparison operator. XQuery also defines value comparison.
Beginning XML 4th Edition. Chapter 12: Simple API for XML (SAX)
Lecture 6 XML DTD Content of.xml fileContent of.dtd file.
1 Chapter 10: XML What is XML What is XML Basic Components of XML Basic Components of XML XPath XPath XQuery XQuery.
Softsmith Infotech XML. Softsmith Infotech XML EXtensible Markup Language XML is a markup language much like HTML Designed to carry data, not to display.
XML Documents Chao-Hsien Chu, Ph.D. School of Information Sciences and Technology The Pennsylvania State University Elements Attributes Comments PI Document.
Introduction to XML This presentation covers introductory features of XML. What XML is and what it is not? What does it do? Put different related technologies.
XML Instructor: Charles Moen CSCI/CINF XML  Extensible Markup Language  A set of rules that allow you to create your own markup language  Designed.
17 Apr 2002 XML Syntax: Documents Andy Clark. Basic Document Structure Element tags – Elements have associated attributes Text content Miscellaneous –
Lecture 16 Introduction to XML Boriana Koleva Room: C54
Sheet 1XML Technology in E-Commerce 2001Lecture 2 XML Technology in E-Commerce Lecture 2 Logical and Physical Structure, Validity, DTD, XML Schema.
1 XML eXtensible Markup Language. 2 XML vs. HTML HTML is a HyperText Markup language HTML is a HyperText Markup language Designed for a specific application,
1 Indexing The syntax for creating a index is: CREATE [UNIQUE] INDEX index_name ON table_name (column1, column2,... column_n) [ COMPUTE STATISTICS ]; Why.
XML and SAX (A quick overview) ● What is XML? ● What are SAX and DOM? ● Using SAX.
COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 4 1COMP9321, 15s2, Week.
225 City Avenue, Suite 106 Bala Cynwyd, PA , phone , fax presents… XML Syntax v2.0.
Well Formed XML The basics. A Simple XML Document Smith Alice.
When we create.rtf document apart from saving the actual info the tool saves additional info like start of a paragraph, bold, size of the font.. Etc. This.
What is XML? eXtensible Markup Language eXtensible Markup Language A subset of SGML (Standard Generalized Markup Language) A subset of SGML (Standard Generalized.
Introduction to DTD A Document Type Definition (DTD) defines the legal building blocks of an XML document. It defines the document structure with a list.
XP Tutorial 9New Perspectives on HTML and XHTML, Comprehensive 1 Working with XHTML Creating a Well-Formed Valid Document Tutorial 9.
XML Validation. a simple element containing text attribute; attributes provide additional information about an element and consist of a name value pair;
Lecture 23 XQuery 1.0 and XPath 2.0 Data Model. 2 Example 31.7 – User-Defined Function Function to return staff at a given branch. DEFINE FUNCTION staffAtBranch($bNo)
2/25/2016XML1 XML for Beginners Sridevi 1. Basic XML Concepts 2. Defining XML Data Formats 3. Querying XML Data.
C Copyright © 2011, Oracle and/or its affiliates. All rights reserved. Introduction to XML Standards.
Jackson, Web Technologies: A Computer Science Perspective, © 2007 Prentice-Hall, Inc. All rights reserved Chapter 7 Representing Web Data:
XML – Basic Concepts (modified version from Dr. Praveen Madiraju) 2015, Fall Pusan National University Ki-Joune Li.
XML Notes taken from w3schools. What is XML? XML stands for EXtensible Markup Language. XML was designed to store and transport data. XML was designed.
1 XML eXtensible Markup Language. 2 Introduction and Motivation Dr. Praveen Madiraju Modified from Dr.Sagiv’s slides.
Unit 4 Representing Web Data: XML
G52IWS: Extensible Markup Language (XML)
Chapter 7 Representing Web Data: XML
CS 240 – Advanced Programming Concepts
Allyson Falkner Spokane County ISD
Review of XML IST 421 Spring 2004 Lecture 5.
Presentation transcript:

XML Robert Grimm New York University

The Whirlwind So Far  HTTP  Persistent connections  (Style sheets)  Fast servers  Event driven architectures  Clusters  Availability metrics  Strategies for self-management, data replication, and load balancing  Caching  Zipf-like popularity distributions  Effectiveness of cooperative caching

Content: XML

The Essence of XML  External format for representing data  Two simple properties  Self-describing  Possible to derive internal representation from external one  Round-tripping  When converting from internal to external to internal the two internal representations are equal  Does XML have these properties? No!  “So, the essence of XML is this: the problem it solves is not hard, and it does not solve the problem well.”

XML The Standards Soup  Basic XML  XML 1.0  Namespaces in XML  XML Information Set  Typing XML documents  DTDs (part of XML 1.0)  XML Schema  Querying XML documents  XPath  XQuery

XML Basic Ingredients  Elements ,, Something  Attributes   Character data  Character data goes here  Entity references  < & > " &apos;

XML Basic Ingredients (cont.)  Raw character data  <![CDATA[ Some text here ]]  Comments   Processing instructions 

An XML Document  XML Declaration   One root element  All other elements must be nested, never overlap  All attribute values must be quoted  No element may have more than one attribute with a given name  Comments and processing instructions may not appear in tags  No unescaped < or & signs

Internationalization  XML documents contain Unicode text  But they may still have different encodings  UCS-2, UTF-16, UTF-8, ISO , Cp1252, MacRoman  Parsers look for #xFEFF, #xFFFE, #xEFBBBF, #x3C3F786D  Element names may contain any letter   Character data may use character references  њ or &#x45A to refer to њ  Elements may have an xml:lang attribute  λογος

Typing XML Documents Take 1: DTDs  A special syntax to define  Element nesting  Element occurrence constraints  Character data occurrence constraints  Permitted attributes  Attribute types and default values  More entities

Typing XML Documents Take 2: XML Schema  Why XML Schema?  Not a special syntax, just XML  More expressive  Precise control over element & attribute content

XML Schema from 1,000,000 Feet  Simple types  19 of them, including booleans, integers, and strings  Complex types  Atomic, list, and union types  Derivation by restriction  Derivation by extension  Support for global and local declarations  That’ it…

XML Schema Formalization Concepts  Named types  Structural types  Validation  Matching  Erasure  Relation  Function

XML Namespaces  Motivation  We want to mix different document types in the same document  E.g., XHTML document that also contains SVG and MathML  The basic idea  Associate each element or attribute name with a namespace  Namespaces are identified by URIs  Essentially, URIs serve a opaque tokens  However, it is good practice to point to documentation

XML Namespaces (cont.)  URIs are long, contain illegal characters (/,%,~)  Use qualified names (consisting of prefix + local part)  rdf:description, xlink:type, xsl:template  Bind prefixes to URIs  xmlns:rdf=“  Support default namespace  xmlns=“

Parsing XML  In general, writing parsers for external representations is painful  Parsers for XML (may) reduce the tedium, check for  Well-formed content  Data adheres to XML syntax  Valid content  Data adheres to some type declaration  Think DTD, XML Schema

Common XML Parser APIs  Document Object Model (DOM)  Maintained by W3C  Tree-based  Exposes generic containers, allowing applications to traverse tree  Simple API for XML (SAX)  Coordinated by David Megginson, hosted by SourceForge  Event-based  Exposes parsing events directly to application through callbacks  Why and when use one or the other API?

SAX Setup  Create a parser  XMLReader xr = XMLReaderFactory.createXMLReader();  Configure parser  xr.setContentHandler(myContentHandler);  Configure features    Parse XML document  xr.parse(new InputSource(in)); 

SAX ContentHandler  The methods  setDocumentLocator(locator)  startDocument(), endDocument()  characters(ch, start, len), ignorableWhitespace(…)  startElement(uri, localName, qName, atts) endElement(uri, localName, qName)  startPrefixMapping(prefix, uri) endPrefixMapping(prefix, uri)  skippedEntity(name)  processingInstruction(target, data)  What’s missing from this API?

S-Expressions: A Much Simpler External Data Format  Pair: record structure with two fields (car, cdr)  (1. 2)  List: empty, or pair whose cdr is a list  (), (1 2 3)  Some basic Scheme types  Booleans  #t, #f  Strings  “This is a string”  Integers  123

So, Why Is XML So Popular?  Dare Obasanjo argues  Support for internationalization  Platform independence  Human-readable format  Extensibility  Large number of off-the-shelf tools  What do you think?