1 XML and QUERY Shilpi Ahuja CSE 591 - Data Mining 4th April 2002.

Presentation transcript:

1 XML and QUERY Shilpi Ahuja CSE 591 - Data Mining 4th April 2002

2 What is XML? (Extensible Markup Language) A markup language for structured documentation. A structural and semantic language, not a formatting language. Not just for Web pages.

3 HTML vs. XML. External presentation: Xaver Roe, Wikingerufer 7, Berlin. XML: Xaver Roe, Wikingerufer 7, Berlin. HTML: Xaver Roe, Wikingerufer 7, Berlin.
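A minimal sketch of the contrast on this slide. The tag names (address, name, street, city) and the HTML layout are assumptions chosen for illustration, not the slide's own markup. The XML version labels what each value is, so a program can ask for a field by name; the HTML version only says how the text should look.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML markup: tag names say what each piece of data is.
xml_version = (
    "<address>"
    "<name>Xaver Roe</name>"
    "<street>Wikingerufer 7</street>"
    "<city>Berlin</city>"
    "</address>"
)

# Hypothetical HTML markup: tags only say how the text is displayed.
html_version = "<p><b>Xaver Roe</b><br>Wikingerufer 7<br>Berlin</p>"

record = ET.fromstring(xml_version)
city = record.findtext("city")  # look the field up by its meaning
print(city)
```

Nothing in the HTML string identifies which line is the city; a program would have to guess from position.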

4 Why “Extensible Markup Language”? Language: it has a grammar; it has a vocabulary (sort of); it can be parsed by machines. Markup language: a mechanism to identify structures in a document; it says what things are, not what they do; it is not a programming language; it is not compiled. Extensible: you can add words to the language.

5 XML describes structure and semantics, not formatting. XML documents form a tree. Element and attribute names reflect the kind of the element. Formatting can be added with a style sheet.

6 So Is XML Just Like HTML? Discussion Question ?

7 Answer: No. In HTML, both the tag semantics and the tag set are fixed. XML specifies neither semantics nor a tag set. XML lets you define your own tags. HTML describes layout; XML describes the structure of a document. XML separates content from presentation.

8 So Is XML Just Like SGML? No. Well, yes, sort of! XML is a much-restricted form of SGML. It is defined as an application profile of SGML. SGML is not well suited to serving documents over the web.

9 So Why XML? XML was created so that richly structured documents could be used over the web. HTML is bound to a fixed set of semantics and does not allow arbitrary structure. SGML provides arbitrary structure, but is too difficult to implement just for a web browser.

10 What is the advantage of using XML ? Discussion Question ?

11 A Simple XML Document Extensible Markup Language Proposed Jane Doe, Staff Writer The newly proposed XML Specification has been making a splash in the community. The newly proposed XML draft stands to revolutionize the exchange of easily. No Notes XML 1.0 Recommendation Released John E. Doe, Reporter The W3C today released the final recommendation for XML XML Developers, are already using the released recommendation See for more information

12 Characteristics. The document begins with an XML declaration, which is written like a processing instruction, for example <?xml version="1.0"?>. All tags are opened and closed. Empty tags end with />. There is a unique root element. Elements may not overlap. Attribute values are quoted. The characters < and & are only used to start tags and entity references.
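The rules above can be checked mechanically. The sketch below uses Python's standard xml.etree.ElementTree parser, which rejects any input that is not well-formed; the helper name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One sample per rule from the slide; only the first one obeys all of them.
samples = {
    "good": '<?xml version="1.0"?><Article Editor="Ernie Pyle">'
            "<Headline>Hi</Headline><Notes/></Article>",
    "unclosed tag": "<Article><Headline>Hi</Article>",
    "overlapping elements": "<Article><b><i>Hi</b></i></Article>",
    "two root elements": "<Article/><Article/>",
    "unquoted attribute": "<Article Editor=Ernie/>",
    "bare ampersand": "<Article>AT&T</Article>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```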

13 Elements. The most common form of markup. Example: Article, Headline and Byline are all elements. Delimited by angle brackets, most elements identify the nature of the content they surround. Some elements may be empty, i.e. they have no content. A non-empty element always begins with a start-tag, <element>, and ends with an end-tag, </element>.

14 Attributes. Attributes are name-value pairs that occur inside tags after the element name. For example, <Article Editor="Ernie Pyle"> is the Article element with the attribute Editor having the value Ernie Pyle. In XML, all attribute values must be quoted.
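As a small illustration of elements and attributes together, the sketch below parses a fragment built from the names used on these slides (Article, Headline, Byline, Editor). The fragment itself is an assumption put together for this example, not the slide's original document.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment; the element content is borrowed from the sample text.
doc = (
    '<Article Editor="Ernie Pyle">'
    "<Headline>XML 1.0 Recommendation Released</Headline>"
    "<Byline>John E. Doe, Reporter</Byline>"
    "</Article>"
)

article = ET.fromstring(doc)
editor = article.get("Editor")                 # attribute value by name
child_names = [child.tag for child in article]  # nested elements, in order

print(editor)
print(child_names)
```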

15 Entity References. Entities are used to represent special characters like the left angle bracket, “<”, which is written &lt;. They are also used to refer to often repeated or varying text and to include the content of external files. Every entity must have a unique name. Entity references begin with an ampersand and end with a semicolon.

16 Declaring & Referencing Entities. With the entity NEWSPAPER declared as “Vervet Logic Times”, using &NEWSPAPER; anywhere in the document inserts “Vervet Logic Times” at that location. Internal entities allow you to define shortcuts for frequently typed text or text that is expected to change, such as the revision status of a document.
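A minimal sketch of declaring and referencing an internal entity. The document below is an assumption written for illustration: it declares NEWSPAPER in the internal DTD subset and references it once, alongside the predefined entities &lt; and &amp;. Python's standard ElementTree parser expands all three while parsing.

```python
import xml.etree.ElementTree as ET

doc = """<?xml version="1.0"?>
<!DOCTYPE Article [
  <!ENTITY NEWSPAPER "Vervet Logic Times">
]>
<Article>
  <Byline>Jane Doe, &NEWSPAPER; Staff Writer</Byline>
  <Body>Use &lt; and &amp; for the reserved characters.</Body>
</Article>"""

article = ET.fromstring(doc)
byline = article.findtext("Byline")  # &NEWSPAPER; has been replaced
body = article.findtext("Body")      # &lt; and &amp; have been replaced

print(byline)
print(body)
```

Changing the declaration in one place changes every reference, which is the point of using an entity for text that is expected to change.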

17 Comments. Comments begin with “<!--” and end with “-->”. Comments can contain any data except the literal string “--”. Comments are not part of the textual content of an XML document. An XML processor is not required to pass them along to an application.
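The behaviour described above can be observed with Python's standard ElementTree parser, which by default does not pass comments on to the tree it builds. This is a sketch with made-up sample strings; the helper name parses is hypothetical.

```python
import xml.etree.ElementTree as ET

# The comment is dropped; only the element's text reaches the application.
root = ET.fromstring("<Notes>No Notes<!-- internal reminder --></Notes>")
text = root.text
serialized = ET.tostring(root, encoding="unicode")


def parses(source: str) -> bool:
    """Return True if the source is accepted as well-formed XML."""
    try:
        ET.fromstring(source)
        return True
    except ET.ParseError:
        return False


# A double hyphen inside a comment makes the document ill-formed.
double_hyphen_ok = parses("<Notes><!-- a -- b --></Notes>")

print(text, serialized, double_hyphen_ok)
```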

18 DTD (Document Type Definition). Formally identifies the relationships between the various elements that form the document. Can express constraints on the sequence and nesting of tags. Can express constraints on attribute values and their types and defaults. Can also declare the names of external files that may be referenced, the formats of some external (non-XML) data that may be included, and entities that may be encountered.

19 The Newspaper DTD
<!ATTLIST ARTICLE AUTHOR CDATA #REQUIRED EDITOR CDATA #IMPLIED DATE CDATA #IMPLIED EDITION CDATA #IMPLIED>
ELEMENT symbols: * as many times as needed (zero or more); + at least once; ? once or not at all; , must be in listed order; | either one or the other, any order.
ATTRIBUTE options: #REQUIRED – must be present; #IMPLIED – can be omitted.
Attribute data types: CDATA – character data; ENUMERATED – list of values; ID – unique ID; IDREF, IDREFS – referred value; ENTITY, ENTITIES – binary data; NMTOKEN, NMTOKENS, NOTATION.
ELEMENT data type: #PCDATA – any characters.

20 Types of declarations in XML: element declarations, attribute list declarations, entity declarations, notation declarations.

21 Element Declarations. Identify the names of elements and the nature of their content. Example: <!ELEMENT Article (Headline, Byline, Lead, Body, Notes?)>. An Article must contain Headline, Byline, Lead and Body, in that order, and may contain Notes.
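To see how a parser reads such a content model, the sketch below feeds an element declaration matching the description above to Python's standard expat bindings and inspects what they report. The declaration text is an assumption reconstructed from the slide's wording. expat does not validate, so the empty Article in the sample is accepted even though the DTD would forbid it.

```python
import xml.parsers.expat
from xml.parsers.expat import model

doc = """<!DOCTYPE Article [
  <!ELEMENT Article (Headline, Byline, Lead, Body, Notes?)>
]>
<Article/>"""

declarations = {}


def on_element_decl(name, content_model):
    # content_model is a nested (type, quantifier, name, children) tuple.
    declarations[name] = content_model


parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.Parse(doc, True)

_type, _quant, _name, children = declarations["Article"]
# No quantifier means exactly once; '?' means once or not at all.
required = [c[2] for c in children if c[1] == model.XML_CQUANT_NONE]
optional = [c[2] for c in children if c[1] == model.XML_CQUANT_OPT]

print("required:", required)
print("optional:", optional)
```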

22 ELEMENT DATA TYPE (#PCDATA). Parsed character data. Example: <!ELEMENT Byline (#PCDATA | quote)*>. The vertical bar indicates an “or” relationship. The asterisk indicates that the content is optional (may occur zero or more times). Byline may contain zero or more characters and quote tags.

23 Attribute Declarations. Identify which elements may have attributes, what attributes they may have, what values the attributes may hold, and what default value each attribute has.

24 Attributes: Example. <!ATTLIST ARTICLE AUTHOR ID #REQUIRED EDITOR CDATA #IMPLIED STATUS ( funny | notfunny ) 'funny'> This declares three attributes for ARTICLE: Author, which is an ID and is required; Editor, which is a string and is not required; and Status, which must be either funny or notfunny and defaults to funny if not specified.
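The default value in this declaration can be seen at work with Python's standard ElementTree parser, which applies defaults declared in the internal DTD subset. This is a sketch: the two sample ARTICLE elements and the ID values a1 and a2 are made up for illustration. The parser is non-validating, so it fills in defaults but would not complain about a missing #REQUIRED attribute.

```python
import xml.etree.ElementTree as ET

dtd = """<!DOCTYPE ARTICLE [
  <!ATTLIST ARTICLE
    AUTHOR ID #REQUIRED
    EDITOR CDATA #IMPLIED
    STATUS (funny | notfunny) 'funny'>
]>"""

# STATUS and EDITOR are omitted here; only STATUS has a declared default.
with_default = ET.fromstring(dtd + '<ARTICLE AUTHOR="a1"/>')

# Every attribute is given explicitly, so no default is applied.
explicit = ET.fromstring(
    dtd + '<ARTICLE AUTHOR="a2" EDITOR="Ernie Pyle" STATUS="notfunny"/>'
)

print(with_default.get("STATUS"), with_default.get("EDITOR"))
print(explicit.get("STATUS"), explicit.get("EDITOR"))
```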

25 Types of Attributes: CDATA; ID; IDREF or IDREFS; ENTITY or ENTITIES; NMTOKEN or NMTOKENS; a list of names (enumerated).

26 Types of Default Values: #REQUIRED; #IMPLIED; "value"; #FIXED "value".

27 Notation Declarations. Identify specific types of external binary data. This information is passed to the processing application, which may make whatever use of it it wishes. A typical notation declaration is:

28 XML-QL: A Query Language for XML. Designed at AT&T Labs. XML-QL has a SELECT-WHERE construct, like SQL. It borrows features of query languages recently developed by the database research community for semi-structured data. XML-QL can express queries, which extract pieces of data from XML documents.

29 Features of XML-QL. Declarative: like SQL. Relationally complete: it can express joins. Easy implementation. Data extraction: XML-QL can extract data from existing XML documents and construct new XML documents. Views: supports both ordered and unordered views on an XML document. Availability: XML-QL is implemented as a prototype and is freely available in a Java version.

30 Features of XML-QL. Path expressions: supports partially specified path expressions. Building new elements: supports creation of new elements. Combining data sources: supports querying several data sources at the same time. Negation: XML-QL doesn’t support negation. Aggregation: doesn’t support aggregate functions like min, max, sum, count and avg. Update language: XML-QL doesn’t provide any support for insert, delete and update of elements.

31 Queries in XML-QL. Query 1: Produce all editors of the articles where the author is John Doe. Features exploited: selection, projection and data extraction on element values.

32 Query Function query() { CONSTRUCT { WHERE "John Doe" $b IN "newspaper.xml" CONSTRUCT $b }

33 Query Output OUTPUT: Ernie Pyle

34 Explanation. This query matches every article element in the XML document newspaper.xml that has at least one author element and an editor element, and whose author name is “John Doe”. For each such match, it binds the variable b to the editor. The result is the list of editors bound to b.
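For readers without an XML-QL implementation at hand, the same selection can be sketched in Python with the standard ElementTree module. This is not XML-QL, only an equivalent of the matching and binding described above. The element names (newspaper, article, author, editor, result), the inline data and the function name editors_for are assumptions for illustration; the second article and its editor are invented so that the filter has something to exclude.

```python
import xml.etree.ElementTree as ET

# Stand-in for newspaper.xml; structure and second article are hypothetical.
newspaper = ET.fromstring("""
<newspaper>
  <article>
    <author>John Doe</author>
    <editor>Ernie Pyle</editor>
  </article>
  <article>
    <author>Jane Doe</author>
    <editor>Someone Else</editor>
  </article>
</newspaper>
""")


def editors_for(root, author_name):
    """Build a result element holding the editors of matching articles."""
    result = ET.Element("result")
    for article in root.iter("article"):
        authors = [a.text for a in article.findall("author")]
        if author_name in authors:                   # the WHERE pattern
            for editor in article.findall("editor"):  # binds "b"
                ET.SubElement(result, "editor").text = editor.text
    return result


result = editors_for(newspaper, "John Doe")
editors = [e.text for e in result]
print(editors)
```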

35 Discussion Question ? Can XML be used for things besides the Internet?