Chapter 7 © 2010 by Addison Wesley Longman, Inc. 1 7.1 Introduction - SGML is a meta-markup language - Developed in the early 1980s; ISO std. In 1986 -

Slides:



Advertisements
Similar presentations
XML-XSL Introduction SHIJU RAJAN SHIJU RAJAN Outline Brief Overview Brief Overview What is XML? What is XML? Well Formed XML Well Formed XML Tag Name.
Advertisements

XML: Extensible Markup Language
An Introduction to XML Based on the W3C XML Recommendations.
XML 6.3 DTD 6. XML and DTDs A DTD (Document Type Definition) describes the structure of one or more XML documents. Specifically, a DTD describes:  Elements.
Document Type Definitions
COS 381 Day 17. Agenda Questions?? Resources Source Code Available for examples in Text Book in Blackboard
Chapter 10 © 2001 by Addison Wesley Longman, Inc. 1 Chapter 10 Sebesta: Programming the World Wide Web.
COS 381 Day 15. Agenda Assignment 3 Corrected 3 A’s, 1 B, 1 C and 2 D’s Assignment 4 posted Due March 28 Those that wanted to resubmit Assignment 2 must.
Jackson, Web Technologies: A Computer Science Perspective, © 2007 Prentice-Hall, Inc. All rights reserved Chapter 7 Representing Web Data:
Tutorial 11 Creating XML Document
COS 381 Day 14. Agenda Questions?? Resources Source Code Available for examples in Text Book in Blackboard
COS 381 Day 16. Agenda Assignment 4 posted Due April 1 There was no resubmits of Assignment Capstone Progress report Due March 24 Today we will discuss.
Document Type Definitions. XML and DTDs A DTD (Document Type Definition) describes the structure of one or more XML documents. Specifically, a DTD describes:
Introduction to XML This material is based heavily on the tutorial by the same name at
XP New Perspectives on XML Tutorial 4 1 XML Schema Tutorial – Carey ISBN Working with Namespaces and Schemas.
Chapter 8 Introduction to XML. © 2006 Pearson Addison-Wesley. All rights reserved Introduction - SGML is a meta-markup language - Developed in.
XML Anisha K J Jerrin Thomas. Outline  Introduction  Structure of an XML Page  Well-formed & Valid XML Documents  DTD – Elements, Attributes, Entities.
Chapter 7 © 2013 by Pearson Introduction - SGML is a meta-markup language - HTML was developed using SGML in the early 1990s - specifically for.
Why XML ? Problems with HTML HTML design - HTML is intended for presentation of information as Web pages. - HTML contains a fixed set of markup tags. This.
Chapter 8 © 2003 by Addison-Wesley, Inc. 1 Chapter 8 Introduction to XML.
E X TENSIBLE M ARKUP L ANGUAGE (XML). What is XML?  XML stands for EXtensible Markup Language  XML is mainly designed to carry (or transmit) data, not.
Dr. Azeddine Chikh IS446: Internet Software Development.
XML CPSC 315 – Programming Studio Fall 2008 Project 3, Lecture 1.
Representing Web Data: XML CSI 3140 WWW Structures, Techniques and Standards.
XP 1 CREATING AN XML DOCUMENT. XP 2 INTRODUCING XML XML stands for Extensible Markup Language. A markup language specifies the structure and content of.
Document Type Definitions Kanda Runapongsa Dept. of Computer Engineering Khon Kaen University.
XML What is XML? XML v.s. HTML XML Components Well-formed and Valid Document Type Definition (DTD) Extensible Style Language (XSL) SAX and DOM.
Introduction to XML. What is XML? Extensible Markup Language XML Easier-to-use subset of SGML (Standard Generalized Markup Language) XML is a.
XML 1 Enterprise Applications CE00465-M XML. 2 Enterprise Applications CE00465-M XML Overview Extensible Mark-up Language (XML) is a meta-language that.
XML Syntax - Writing XML and Designing DTD's
Representing Web Data: XML CSI 3140 WWW Structures, Techniques and Standards.
What is XML?  XML stands for EXtensible Markup Language  XML is a markup language much like HTML  XML was designed to carry data, not to display data.
1 Introduction to Web Application Introduction to XML.
1 Tutorial 13 Validating Documents with DTDs Working with Document Type Definitions.
1 Chapter 10: XML What is XML What is XML Basic Components of XML Basic Components of XML XPath XPath XQuery XQuery.
Dr. Ahmet Cengizhan Dirican BIL 374 Internet Technologies 7. XML.
Copyright © 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter X IXXXXXXXXXX XXXXXX.
IST 221 Internet Concepts and Applications Introduction to XML II CSS and XSLT Style Sheets.
Softsmith Infotech XML. Softsmith Infotech XML EXtensible Markup Language XML is a markup language much like HTML Designed to carry data, not to display.
New Perspectives on XML, 2nd Edition
XML Documents Chao-Hsien Chu, Ph.D. School of Information Sciences and Technology The Pennsylvania State University Elements Attributes Comments PI Document.
Chapter 7 © 2014 by Pearson Education Introduction - SGML is a meta-markup language - Developed in the early 1980s; ISO std. In HTML was developed.
Chapter 8 Introduction to XML. © 2006 Pearson Addison-Wesley. All rights reserved Introduction - SGML is a meta-markup language - Developed in.
Introduction to XML This presentation covers introductory features of XML. What XML is and what it is not? What does it do? Put different related technologies.
XML Instructor: Charles Moen CSCI/CINF XML  Extensible Markup Language  A set of rules that allow you to create your own markup language  Designed.
Lecture 16 Introduction to XML Boriana Koleva Room: C54
1 Credits Prepared by: Rajendra P. Srivastava Ernst & Young Professor University of Kansas Sponsored by: Ernst & Young, LLP (August 2005) XBRL Module Part.
An Introduction to XML Sandeep Bhattaram
COS 381 Day 14. Agenda Questions?? Resources Source Code Available for examples in Text Book in Blackboard
XML 2nd EDITION Tutorial 4 Working With Schemas. XP Schemas A schema is an XML document that defines the content and structure of one or more XML documents.
1 Tutorial 14 Validating Documents with Schemas Exploring the XML Schema Vocabulary.
Tutorial 13 Validating Documents with Schemas
Internet & World Wide Web How to Program, 5/e. © by Pearson Education, Inc. All Rights Reserved.2.
COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 4 1COMP9321, 15s2, Week.
CS3101: Internet Programming Chapter 06: XML and RSS.
XML CSC1310 Fall HTML (TIM BERNERS-LEE) HyperText Markup Language  HTML (HyperText Markup Language): December  Markup  Markup is a symbol.
Dr. Ahmet Cengizhan Dirican BIL 374 Internet Technologies 7. XML.
C Copyright © 2011, Oracle and/or its affiliates. All rights reserved. Introduction to XML Standards.
Jackson, Web Technologies: A Computer Science Perspective, © 2007 Prentice-Hall, Inc. All rights reserved Chapter 7 Representing Web Data:
XML Introduction to XML Extensible Markup Language.
XML Notes taken from w3schools. What is XML? XML stands for EXtensible Markup Language. XML was designed to store and transport data. XML was designed.
Chapter 8 Introduction to XML.
Unit 4 Representing Web Data: XML
XML QUESTIONS AND ANSWERS
XML in Web Technologies
CHAPTER 5 EXTENSIBLE MARKUP LANGUAGE (XML)
Chapter 7 Representing Web Data: XML
Chapter X IXXXXXXXXXXXXXXXX.
7.1 Introduction - SGML is a meta-markup language
Presentation transcript:

Chapter 7 © 2010 by Addison Wesley Longman, Inc Introduction - SGML is a meta-markup language - Developed in the early 1980s; ISO std. In HTML was developed using SGML in the early 1990s - specifically for Web documents - Two problems with HTML: 1. Fixed set of tags and attributes - User cannot define new tags or attributes - So, the given tags must fit every kind of document, and the tags cannot connote any particular meaning 2. There are few restrictions on arrangement or order of tag appearance in a document - One solution to the first of these problems: Let each group of users define their own tags (with implied meanings) (i.e., design their own “HTML”s using SGML)

Chapter 7 © 2010 by Addison Wesley Longman, Inc Introduction (continued) - Problem with using SGML: - It’s too large and complex to use, and it is very difficult to build a parser for it - A better solution: Define a lite version of SGML - XML is not a replacement for HTML - HTML is a markup language used to describe the layout of any kind of information - XML is a meta-markup language that can be used to define markup languages that can define the meaning of specific kinds of information - XML is a very simple and universal way of storing and transferring data of any kind - XML does not predefine any tags - XML has no hidden specifications - All documents described with an XML-derived markup language can be parsed with a single parser

Chapter 7 © 2010 by Addison Wesley Longman, Inc Introduction (continued) - We will refer to an XML-based markup language as a tag set - Strictly speaking, a tag set is an XML application, but that terminology can be confusing - An XML processor is a program that parses XML documents and provides the parts to an application - A document that uses an XML-based markup language is an XML document 7.2 The Syntax of XML - The syntax of XML is in two distinct levels: 1. The general low-level rules that apply to all XML documents 2. For a particular XML tag set, either a document type definition (DTD) or an XML schema

Chapter 7 © 2010 by Addison Wesley Longman, Inc The Syntax of XML (continued) - General XML Syntax - XML documents consist of: 1. data elements 2. markup declarations (instructions for the XML parser) 3. processing instructions (for the application program that is processing the data in the document) - All XML documents begin with an XML declaration: - XML names: - Must begin with a letter or an underscore - They can include digits, hyphens, and periods - There is no length limitation - They are case sensitive (unlike HTML names)

Chapter 7 © 2010 by Addison Wesley Longman, Inc The Syntax of XML (continued) - Syntax rules for XML: same as those of XHTML - Every XML document defines a single root element, whose opening tag must appear as the first line of the document - An XML document that follows all of these rules is well formed 1960 Cessna Centurian Yellow with white trim Gulfport Mississippi

Chapter 7 © 2010 by Addison Wesley Longman, Inc The Syntax of XML (continued) - Attributes are not used in XML the way they are in HTML - In XML, you often define a new nested tag to provide more info about the content of a tag - Nested tags are better than attributes, because attributes cannot describe structure and the structural complexity may grow - Attributes should always be used to identify numbers or names of elements (like HTML id and name attributes)

Chapter 7 © 2010 by Addison Wesley Longman, Inc The Syntax of XML (continued)... Maggie Dee Magpie... <!-- A tag with one nested tag, which contains three nested tags --> Maggie Dee Magpie...

Chapter 7 © 2010 by Addison Wesley Longman, Inc XML Document Structure - An XML document often uses two auxiliary files: - One to specify the structural syntactic rules - One to provide a style specification - An XML document has a single root element, but often consists of one or more entities - Entities range from a single special character to a book chapter - An XML document has one document entity - Reasons for entity structure: 1. Large documents are easier to manage 2. Repeated entities need not be literally repeated 3. Binary entities can only be referenced in the document entities (XML is all text!)

Chapter 7 © 2010 by Addison Wesley Longman, Inc XML Document Structure (continued) - When the XML parser encounters a reference to a non-binary entity, the entity is merged in - Entity names: - No length limitation - Must begin with a letter, a dash, or a colon - Can include letters, digits, periods, dashes, underscores, or colons - A reference to an entity has the form: & entity_name ; - Predefined entities (as in XHTML): < < > > & & " " ' &apos;

Chapter 7 © 2010 by Addison Wesley Longman, Inc XML Document Structure (continued) - If several predefined entities must appear near each other in a document, it is better to avoid using entity references - Character data section e.g., instead of Start > > > > HERE < < < < use >>> HERE - If the CDATA content has an entity reference, it is taken literally

Chapter 7 © 2010 by Addison Wesley Longman, Inc Document Type Definitions - A DTD is a set of structural rules called declarations - These rules specify a set of elements, along with how and where they can appear in a document - Purpose: provide a standard form for a collection of XML documents and define a markup language for them - The DTD for a document can be internal or external - All of the declarations of a DTD are enclosed in the block of a DOCTYPE markup declaration - DTD declarations have the form: - There are four possible declaration keywords: ELEMENT, ATTLIST, ENTITY, and NOTATION

Chapter 7 © 2010 by Addison Wesley Longman, Inc Document Type Definitions (continued) - Declaring Elements - An element declaration specifies the name of an an element, and the element’s structure - If the element is a leaf node of the document tree, its structure is in terms of characters - If it is an internal node, its structure is a list of children elements (either leaf or internal nodes) - General form: e.g., memo fromtodaterebody

Chapter 7 © 2010 by Addison Wesley Longman, Inc Document Type Definitions (continued) - Declaring Elements (continued) - Child elements can have modifiers, +, *, ? e.g., <!ELEMENT person (parent+, age, spouse?, sibling*)> - Leaf nodes specify data types, most often PCDATA, which is an acronym for parsable character data - Data type could also be EMPTY (no content) and ANY (can have any content) - Example of a leaf declaration: - Declaring Attributes - General form:

Chapter 7 © 2010 by Addison Wesley Longman, Inc Document Type Definitions (continued) - Declaring Attributes (continued) - Attribute types: there are ten different types, but we will consider only CDATA - Default values: a value #FIXED value (every element will have this value), #REQUIRED (every instance of the element must have a value specified), or #IMPLIED (no default value and need not specify a value)...

Chapter 7 © 2010 by Addison Wesley Longman, Inc Document Type Definitions (continued) - Declaring Entities - Two kinds: - A general entity can be referenced anywhere in the content of an XML document - A parameter entity can be referenced only in a markup declaration - General form of declaration: e.g., - A reference: &jfk; - If the entity value is longer than a line, define it in a separate file (an external text entity)  SHOW planes.dtd

Chapter 7 © 2010 by Addison Wesley Longman, Inc Document Type Definitions (continued) - XML Parsers - Always check for well formedness - Some check for validity, relative to a given DTD - Called validating XML parsers - You can download a validating XML parser from: - Internal DTDs <!DOCTYPE root_name [ … ]> - External DTDs <!DOCTYPE XML_doc_root_name SYSTEM “DTD_file_name” >

Chapter 7 © 2010 by Addison Wesley Longman, Inc Namespaces - A markup vocabulary is the collection of all of the element types and attribute names of a markup language (a tag set) - An XML document may define its own tag set and also use those of another tag set - CONFLICTS! - An XML namespace is a collection of names used in XML documents as element types and attribute names - The name of an XML namespace has the form of a URI - A namespace declaration has the form: - The prefix is a short name for the namespace, which is attached to names from the namespace in the XML document - In the document, you can use - Purposes of the prefix: 1. Shorthand 2. URI includes characters that are illegal in XML

Chapter 7 © 2010 by Addison Wesley Longman, Inc Namespaces (continued) - Can declare two namespaces on one element <gmcars xmlns:gm = " xmlns:html = " - The gmcars element can now use gm names and xhtml names - One namespace can be made the default by leaving the prefix out of the declaration 7.6 XML Schemas - Problems with DTDs: 1. Syntax is different from XML - cannot be parsed with an XML parser 2. It is confusing to deal with two different syntactic forms 3. DTDs do not allow specification of particular kinds of data

Chapter 7 © 2010 by Addison Wesley Longman, Inc XML Schemas (continued) - XML Schemas is one of the alternatives to DTD - Two purposes: 1. Specify the structure of its instance XML documents 2. Specify the data type of every element and attribute of its instance XML documents - Schemas are written using a namespace: - Every XML schema has a single root, schema The schema element must specify the namespace for schemas as its xmlns:xsd attribute - Every XML schema itself defines a tag set, which must be named targetNamespace = "

Chapter 7 © 2010 by Addison Wesley Longman, Inc XML Schemas (continued) - If we want to include nested elements, we must set the elementFormDefault attribute to qualified - The default namespace must also be specified xmlns = " - A complete example of a schema element: <xsd:schema xmlns:xsd = " <!-- Namespace where elements defined here will be placed --> targetNamespace = " xmlns = " <!-- Next, specify non-top-level elements to be in the target namespace --> elementFormDefault = "qualified">

Chapter 7 © 2010 by Addison Wesley Longman, Inc XML Schemas (continued) - Defining an instance document - The root element must specify the namespaces it uses 1. The default namespace 2. The standard namespace for instances ( XMLSchema-instance ) 3. The location where the default namespace is defined, using the schemaLocation attribute, which is assigned two values <planes xmlns = " xmlns:xsi = xsi:schemaLocation = " planes.xsd" > - Data Type Categories 1. Simple (strings only, no attributes and no nested elements) 2. Complex (can have attributes and nested elements)

Chapter 7 © 2010 by Addison Wesley Longman, Inc XML Schemas (continued) - XMLS defines 44 data types - Primitive: string, Boolean, float, … - Derived: byte, decimal, positiveInteger, … - User-defined (derived) data types – specify constraints on an existing type (the base type) - Constraints are given in terms of facets ( totalDigits, maxInclusive, etc.) - Both simple and complex types can be either named or anonymous - DTDs define global elements (context is irrelevant) - With XMLS, context is essential, and elements can be either: 1. Local, which appears inside an element that is a child of schema, or 2. Global, which appears as a child of schema

Chapter 7 © 2010 by Addison Wesley Longman, Inc XML Schemas (continued) - Defining a simple type: - Use the element tag and set the name and type attributes <xsd:element name = "bird" type = "xsd:string" /> - An instance could have: Yellow-bellied sap sucker - Element values can be constant, specified with the fixed attribute fixed = "three-toed" - User-Defined Types - Defined in a simpleType element, using facets specified in the content of a restriction element - Facet values are specified with the value attribute

Chapter 7 © 2010 by Addison Wesley Longman, Inc XML Schemas (continued) - There are several categories of complex types, but we discuss just one, element-only elements - Element-only elements are defined with the complexType element - Use the sequence tag for nested elements that must be in a particular order - Use the all tag if the order is not important

Chapter 7 © 2010 by Addison Wesley Longman, Inc XML Schemas (continued) <xsd:element name = "make" type = "xsd:string" /> <xsd:element name = "model " type = "xsd:string" /> <xsd:element name = "engine" type = "xsd:string" /> <xsd:element name = "year" type = "xsd:string" /> - Nested elements can include attributes that give the allowed number of occurrences ( minOccurs, maxOccurs, unbounded )  SHOW planes.xsd and planes1.xml - We can define nested elements elsewhere

Chapter 7 © 2010 by Addison Wesley Longman, Inc XML Schemas (continued) - The global element can be referenced in the complex type with the ref attribute - Validating Instances of XML Schemas - One validation tool is xsv, which is available from: - Note: If the schema is incorrect (bad format), xsv reports that it cannot find the schema 7.7 Displaying Raw XML Documents - An XML browser should have a default style sheet for an XML document that does not specify one - You get a stylized listing of the XML  SHOW planes.xml

Chapter 7 © 2010 by Addison Wesley Longman, Inc Displaying XML Documents with CSS - A CSS style sheet for an XML document is just a list of its tags and associated styles - The connection of an XML document and its style sheet is made through an xml-stylesheet processing instruction <?xml-stylesheet type = "text/css" href = "mydoc.css"?> --> SHOW planes.css and run planes.xml 7.9 XSLT Style Sheets - XSL began as a standard for presentations of XML documents - Split into three parts: - XSLT – Transformations - XPATH - XML Path Language - XSL-FO - Formatting objects - XSLT uses style sheets to specify transformations

Chapter 7 © 2010 by Addison Wesley Longman, Inc XSLT Style Sheets (continued) - An XSLT processor merges an XML document into an XSLT document (a style sheet) to create an XSL document - This merging is a template-driven process - An XSLT style sheet can specify page layout, page orientation, writing direction, margins, page numbering, etc. - The processing instruction we used for connecting a XSLT style sheet to an XML document is used to connect an XSLT style sheet to an XML document <?xml-stylesheet type = "text/xsl" href = " XSLT style sheet "?> - An example: <?xml-stylesheet type = "text/xsl" href = "xslplane.xsl" ?> 1977 Cessna Skyhawk Light blue and white

Chapter 7 © 2010 by Addison Wesley Longman, Inc XSLT Style Sheets (continued) - An XSLT style sheet is an XML document with a single element, stylesheet, which defines namespaces <xsl:stylesheet xmlns:xsl = " xmlns = " - If a style sheet matches the root element of the XML document, it is matched with the template: - XSLT documents include two different kinds of elements, those with content and those for which the content will be merged from the XML doc - Elements with content often represent HTML elements Happy Easter!

Chapter 7 © 2010 by Addison Wesley Longman, Inc XML Transformations and Style Sheets (continued) - XSLT elements that represent HTML elements are simply copied to the merged document - The XSLT value-of element - Has no content - Uses a select attribute to specify part of the XML data to be merged into the new document - The value of select can be any branch of the document tree  SHOW xslplane1.xsl and display xslplane.xml - xslplane1.xsl is more complex than necessary  SHOW xslplane2.xsl - The XSLT for-each element - Used when an XML document has a sequence of the same elements  SHOW xslplanes.xml, xslplanes.xsl & display

Chapter 7 © 2010 by Addison Wesley Longman, Inc XML Processors - Purposes: 1. Check the syntax of a document for well- formedness 2. Replace all references to entities by their definitions 3. Copy default values (from DTDs or schemas) into the document 4. If a DTD or schema is specified and the processor includes a validating parser, the structure of the document is validated - Two ways to check well-formedness: 1. A browser with an XML parser 2. A stand-alone XML parser - There are two different approaches to designing XML processors: - SAX and the DOM approach

Chapter 7 © 2010 by Addison Wesley Longman, Inc XML Processors (continued) - The SAX (Simple API for XML) Approach: - Widely accepted and supported - Based on the concept of event processing: - Every time a syntactic structure (e.g., a tag, an attribute, etc.) is recognized, the processor raises an event - The application defines event handlers to respond to the syntactic structures - The DOM Approach - The DOM processor builds a DOM tree structure of the document (Similar to the processing by a browser of an XHTML document) - When the tree is complete, it can be traversed and processed

Chapter 7 © 2010 by Addison Wesley Longman, Inc XML Processors (continued) - Advantages of the DOM approach: 1. Good if any part of the document must be accessed more than once 2. If any rearrangement of the document must be done, it is facilitated by having a representation of the whole document in memory 3. Random access to any part of the document is possible 4. Because the whole document is parsed before any processing takes place, processing of an invalid document is avoided - Disadvantages of the DOM approach: 1. Large documents require a large memory 2. The DOM approach is slower - Note: Most DOM processors use a SAX front end

Chapter 7 © 2010 by Addison Wesley Longman, Inc Web Services - The ultimate goal of Web services: - Allow different software in different places, written in different languages and resident on different platforms, to connect and interoperate - The Web began as provider of markup documents, served through the HTTP methods, GET and POST - An information service system - A Web service is closely related to an information service - The original Web services were provided via Remote Procedure Call (RPC), through two technologies, DCOM and CORBA - DCOM and CORBA use different protocols, which defeats the goal of universal component interoperability - There are three roles required to provide and use Web services: 1. Service providers 2. Service requestors 3. A service registry

Chapter 7 © 2010 by Addison Wesley Longman, Inc Web Services (continued) - Web Service Definition Language (WSDL) - Used to describe available services, as well as of message protocols for their use - Universal Description, Discovery, and Integration Service (UDDI) - Used to create Web services registry, and also methods that allow a remote system to determine which services are available - Standard Object Access Protocol (SOAP) - An XML-based specification that defines the forms of messages and RPCs - Supports the exchange of information among distributed systems - A SOAP message is an XML document that includes an envelope - The body of a SOAP message is either a request or a response