Technical University of Valencia Computer Science Department SOFSEM’07 (22/01/2007) A Program Slicing Based Method to Filter XML/DTD documents.

Presentation transcript:


2 Contents
Motivation
Program Slicing
XML
DTD
XSLT
Slicing XML Documents
Example
Implementation
Conclusions & Future Work
(current section: Program Slicing)


4 Program Slicing
Definition: Program transformation to extract the program statements that (potentially) affect the values computed at some point of interest.
Origin: Originally introduced by Weiser.
Example:
(1) read(n);
(2) i:=1;
(3) sum:=0;
(4) product:=1;
(5) while (i<=n) do begin
(6)   sum:=sum+i;
(7)   product:=product*i;
(8)   i:=i+1;
    end;
(9) write(sum);
(10) write(product);
Slicing Criterion = (10, product)
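A minimal Haskell sketch of how the slice for the criterion (10, product) could be computed on this example. It is not the method from the slides: the `Stmt` record, the hand-written defs/uses table and the flow-insensitive dependence rule (a statement depends on every statement that defines a variable it uses, plus the loop header that controls it) are assumptions made for illustration. A real slicer would use reaching definitions on a PDG.

```haskell
import Data.List (sort)
import Data.Maybe (maybeToList)

-- One statement of the example program: its number, the variables it
-- defines, the variables it uses, and the loop header controlling it.
-- (Hypothetical encoding, written by hand from the example above.)
data Stmt = Stmt
  { label   :: Int
  , defs    :: [String]
  , uses    :: [String]
  , control :: Maybe Int
  }

exampleProgram :: [Stmt]
exampleProgram =
  [ Stmt 1  ["n"]       []               Nothing
  , Stmt 2  ["i"]       []               Nothing
  , Stmt 3  ["sum"]     []               Nothing
  , Stmt 4  ["product"] []               Nothing
  , Stmt 5  []          ["i", "n"]       Nothing
  , Stmt 6  ["sum"]     ["sum", "i"]     (Just 5)
  , Stmt 7  ["product"] ["product", "i"] (Just 5)
  , Stmt 8  ["i"]       ["i"]            (Just 5)
  , Stmt 9  []          ["sum"]          Nothing
  , Stmt 10 []          ["product"]      Nothing
  ]

-- Direct dependences of a statement: every statement defining one of the
-- variables it uses (flow-insensitive, hence conservative), plus its
-- controlling loop header, if any.
dependsOn :: [Stmt] -> Stmt -> [Int]
dependsOn prog s =
  [ label d | d <- prog, any (`elem` defs d) (uses s) ]
    ++ maybeToList (control s)

-- Backward slice: all statements reachable from the criterion statement
-- by following dependences.
backwardSlice :: [Stmt] -> Int -> [Int]
backwardSlice prog start = sort (go [] [start])
  where
    go seen [] = seen
    go seen (l : rest)
      | l `elem` seen = go seen rest
      | otherwise =
          let next = concatMap (dependsOn prog) [ s | s <- prog, label s == l ]
          in  go (l : seen) (next ++ rest)
```

Under these assumptions, `backwardSlice exampleProgram 10` evaluates to `[1,2,4,5,7,8,10]`: statements (3), (6) and (9), which only concern `sum`, are left out of the slice.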

5 Program Slicing
Applications: debugging, code understanding, specialization, etc.
All the applications are based on Program Dependence Graphs (PDGs), which capture the structure and behaviour of programs.
What would happen if program slicing were applied to a data structure? Would it be interesting?

6 Contents (current section: XML)

7 XML (eXtensible Markup Language)
Origin: XML was developed by an XML Working Group formed under the auspices of the World Wide Web Consortium (W3C).
Structure: Documents are trees composed of 'ELEMENTS', which contain attributes.
Example of XML document

8 XML: DTD (Document Type Definition)
Objective: The purpose of a DTD is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements.
Structure: Documents are graphs composed of 'ELEMENTS'.
Example of DTD document

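9 XML (eXtensible Markup Language)
<PersonalInfo>
  <Contact>
    <Status>Professor</Status>
    <Name>Ryan</Name>
    <Surname>Gibson</Surname>
  </Contact>
  <Teaching>
    <Subject>
      <Name>Logic</Name>
      <Sched>Mon/Wed</Sched>
      <Course>Mathematics</Course>
    </Subject>
    <Subject>
      <Name>Algebra</Name>
      <Sched>Mon/Tur</Sched>
      <Course>Mathematics</Course>
    </Subject>
    …
  </Teaching>
  <Research>
    <Project name = "SysLog" year = " " budget = "16000€" />
    ...
  </Research>
</PersonalInfo>

DTD (Document Type Definition)
<!ELEMENT PersonalInfo (Contact, Teaching, Research)>
<!ELEMENT Contact (Status, Name, Surname)>
<!ELEMENT Subject (Name, Sched, Course)>
<!ATTLIST Project name CDATA #REQUIRED
                  year CDATA #REQUIRED
                  budget CDATA #IMPLIED >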

10 XML: XSLT (eXtensible Stylesheet Language Transformations)
Objective: XSLT is a language for transforming XML.
Structure: An XSLT stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses a formatting vocabulary, such as (X)HTML or XSL-FO.
XSLT is a programming language.
[Figures: example of an XSLT document (source code) and example of an XSLT document (result).]

11 Contents (current section: Slicing XML Documents)

12 Slicing XML Documents
We see XML documents and DTDs as trees.
[Figure: the example XML document next to its tree representation, rooted at PersonalInfo with children Contact, Teaching and Research, and with Subject and Project nodes below.]

13 Slicing XML Documents
The slicing criterion is composed of a set of nodes in the tree.
For each node in the slicing criterion, we extract from the tree all those nodes that are in the path from the root to the node.
[Diagram: Web Page (Original) to Web Page (Slice); criterion type: XML / DTD; direction: Forward / Backward.]
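The extraction rule above (keep every node on a path from the root to a criterion node) can be sketched in Haskell as follows. The `Tree` type, the encoding of a node as a list of 0-based child indices, and the shape of `example` (Subject under Teaching, Project under Research) are assumptions made for this illustration; they are not taken from the slides.

```haskell
-- A document seen as a labelled tree (hypothetical, simplified type).
data Tree = Node String [Tree] deriving (Eq, Show)

-- A node is identified by the child indices (0-based) followed from the
-- root; the empty list denotes the root itself.
type Pos = [Int]

leaf :: String -> Tree
leaf l = Node l []

-- The running example, reduced to element names.
example :: Tree
example =
  Node "PersonalInfo"
    [ Node "Contact"  [leaf "Status", leaf "Name", leaf "Surname"]
    , Node "Teaching" [ Node "Subject" [leaf "Name", leaf "Sched", leaf "Course"]
                      , Node "Subject" [leaf "Name", leaf "Sched", leaf "Course"] ]
    , Node "Research" [leaf "Project"]
    ]

-- Backward slice: keep exactly the nodes lying on some path from the
-- root to a node of the criterion. The root is always kept.
backwardSlice :: [Pos] -> Tree -> Tree
backwardSlice crit (Node l kids) =
  Node l
    [ backwardSlice sub k
    | (i, k) <- zip [0 ..] kids
    , let sub = [ ps | (p : ps) <- crit, p == i ]  -- criterion paths through child i
    , not (null sub)
    ]
```

With the criterion `[[1, 0], [2, 0]]` (the first Subject and the Project), `backwardSlice` keeps PersonalInfo, Teaching, Subject, Research and Project and drops everything else, including Contact and the second Subject.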

14 Slicing XML Documents
DTD backward slicing criterion.
[Figure: the example DTD and its tree (PersonalInfo with Contact, Teaching and Research; Contact with Status, Name and Surname; Subject with Name, Sched and Course; Project with Name, Year and Budget), shown for the original web page and for the slice.]

15 Slicing XML Documents
XML backward slicing criterion.
[Figure: the example XML document and its tree, shown for the original web page and for the slice.]

16 Slicing XML Documents
XML backward slicing criterion.
[Figure: the example XML document and its tree; Web Page (Original) and Web Page (Slice).]

17 Slicing XML Documents
We distinguish between DTD and XML slicing criteria. XML slicing criteria are more fine-grained than DTD slicing criteria.
We distinguish between forward and backward slices (or a combination).
[Diagram: Web Page (Original) to Web Page (Slice); criterion type: XML / DTD; direction: Forward / Backward.]
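The slides do not spell out forward slices in text, so the following Haskell sketch rests on an assumed reading: a forward slice keeps the criterion node together with all its descendants, and a backward-forward slice keeps the path from the root plus that whole subtree. The `Tree` and `Pos` types and all function names are hypothetical.

```haskell
-- A document seen as a labelled tree (hypothetical, simplified type).
data Tree = Node String [Tree] deriving (Eq, Show)

-- A node is identified by 0-based child indices from the root.
type Pos = [Int]

leaf :: String -> Tree
leaf l = Node l []

example :: Tree
example =
  Node "PersonalInfo"
    [ Node "Contact"  [leaf "Status", leaf "Name", leaf "Surname"]
    , Node "Teaching" [Node "Subject" [leaf "Name", leaf "Sched", leaf "Course"]]
    , Node "Research" [leaf "Project"]
    ]

-- Child number i of a node, if it exists.
child :: Int -> Tree -> Maybe Tree
child i (Node _ kids)
  | i >= 0 && i < length kids = Just (kids !! i)
  | otherwise                 = Nothing

-- Forward slice (assumed reading): the criterion node and everything
-- below it. Nothing is returned for a position outside the tree.
forwardSlice :: Pos -> Tree -> Maybe Tree
forwardSlice []       t = Just t
forwardSlice (i : is) t = child i t >>= forwardSlice is

-- Backward-forward slice: the path from the root to the criterion node,
-- followed by the full subtree rooted at that node.
backwardForwardSlice :: Pos -> Tree -> Maybe Tree
backwardForwardSlice []       t               = Just t
backwardForwardSlice (i : is) t@(Node l _) = do
  k  <- child i t
  k' <- backwardForwardSlice is k
  Just (Node l [k'])
```

For the position `[1, 0]` (the Subject node), `forwardSlice` returns the Subject subtree with Name, Sched and Course, while `backwardForwardSlice` returns the same subtree wrapped in Teaching and PersonalInfo.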

18 Slicing XML Documents
DTD backward slicing criterion.
[Figure: the same DTD, tree and slice as on slide 14; Web Page (Original) and Web Page (Slice).]

19 Slicing XML Documents
XML forward slicing criterion.
[Figure: the example XML document and its tree, shown for the original web page and for the slice.]

20 Slicing XML Documents
XML backward-forward slicing criterion.
[Figure: the example XML document and its tree, shown for the original web page and for the slice.]

21 Slicing XML Documents
What happens with DTDs? Slices are well-formed, but are they valid?
For each XML slice we produce a DTD slice, and vice versa.
We guarantee that XML slices are valid with respect to DTD slices.
[Diagram: the Slicer takes a DTD document, an XML document and a slicing criterion, and produces a DTD slice document and an XML slice document.]

22 Slicing XML Documents A simple slicing algorithm

23 Slicing XML Documents
In the case of a DTD criterion composed of a set of positions $C = \{p_1, \dots, p_n\} \subseteq \mathit{Pos}(D)$, the algorithm would be the same, except that the first loop would be:
For each $v_1.v_2.(\dots).v_n \in C$ do
  $V' := V' \cup \{v_1,\ v_1.v_2,\ \dots,\ v_1.v_2.(\dots).v_n\}$;
  $W' := W' \cup \{v_1|_i.v_2|_j.(\dots).v_n|_k\}$
where $v_1.v_2.(\dots).v_n \in V'$ and $v_1|_i.v_2|_j.(\dots).v_n|_k \in X$.
Both algorithms produce valid XML and DTD slices with respect to the slicing criterion.
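A rough Haskell sketch of this loop, under simplifying assumptions: a DTD position is represented as a path of element names, V' is the list of all prefixes of the criterion paths, and W' is returned as the pruned XML tree rather than as a set of indexed positions. The `Tree` and `LabelPath` types and the function names are hypothetical.

```haskell
import Data.List (inits, nub)
import Data.Maybe (mapMaybe)

-- An XML document seen as a labelled tree (hypothetical, simplified type).
data Tree = Node String [Tree] deriving (Eq, Show)

-- A DTD position v1.v2.(...).vn, written as the list of element names.
type LabelPath = [String]

leaf :: String -> Tree
leaf l = Node l []

example :: Tree
example =
  Node "PersonalInfo"
    [ Node "Contact"  [leaf "Status", leaf "Name", leaf "Surname"]
    , Node "Teaching" [ Node "Subject" [leaf "Name", leaf "Sched", leaf "Course"]
                      , Node "Subject" [leaf "Name", leaf "Sched", leaf "Course"] ]
    , Node "Research" [leaf "Project"]
    ]

-- V': every non-empty prefix of every path in the criterion.
dtdSliceNodes :: [LabelPath] -> [LabelPath]
dtdSliceNodes crit = nub [ p | c <- crit, p <- drop 1 (inits c) ]

-- W', returned as a tree: keep every XML node whose path of element
-- names from the root belongs to V'.
xmlSlice :: [LabelPath] -> Tree -> Maybe Tree
xmlSlice v' = go []
  where
    go prefix (Node l kids)
      | path `elem` v' = Just (Node l (mapMaybe (go path) kids))
      | otherwise      = Nothing
      where
        path = prefix ++ [l]
```

With the criterion `[["PersonalInfo", "Teaching", "Subject"]]`, the XML slice keeps both Subject elements, because a DTD criterion selects every occurrence of an element type rather than one particular node.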

24 Slicing XML Documents
The following theorem states the correctness of the technique:
Theorem. Let D be a well-formed DTD and X a well-formed XML document valid with respect to D. Given a slice D’ of D and a slice X’ of X computed with an XML slicing criterion C, and given a slice D’’ of D and a slice X’’ of X computed with a DTD slicing criterion C’, then
a) D’ is well-formed and X’ is valid with respect to D’
b) D’’ is well-formed and X’’ is valid with respect to D’’
If all the elements in C are of one of the types in C’, then
c) D’ = D’’
d) X’ is a subtree of X’’

25 Contents (current section: Implementation)

26 Implementation
We have implemented a prototype in Haskell. Haskell provides a formal basis with many advantages for the manipulation of XML documents.
The HaXml library allows us to automatically translate XML or HTML documents into a Haskell representation. In particular, we use the following data structures, which can represent any XML/HTML document:
data Element = Elem Name [Attribute] [Content]
type Attribute = (Name, Value)
data Content = CElem Element | CText String
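To make these data structures concrete, here is a small self-contained Haskell sketch that redeclares the simplified types from the slide (they are not imported from HaXml, whose real definitions are richer), slices an element by positions and prints the result. `Name` and `Value` are assumed to be `String`; `sliceElement`, `render` and `contact` are hypothetical names, and `render` performs no escaping.

```haskell
type Name  = String
type Value = String

-- The simplified document representation shown on the slide.
data Element   = Elem Name [Attribute] [Content] deriving (Eq, Show)
type Attribute = (Name, Value)
data Content   = CElem Element | CText String deriving (Eq, Show)

-- A node is identified by 0-based indices into the content lists.
type Pos = [Int]

-- The Contact fragment of the running example.
contact :: Element
contact =
  Elem "Contact" []
    [ CElem (Elem "Status"  [] [CText "Professor"])
    , CElem (Elem "Name"    [] [CText "Ryan"])
    , CElem (Elem "Surname" [] [CText "Gibson"])
    ]

-- Backward slice: keep only the content lying on a path from the root
-- element to some position of the criterion. Attributes are kept as is.
sliceElement :: [Pos] -> Element -> Element
sliceElement crit (Elem n as cs) =
  Elem n as
    [ keep sub c
    | (i, c) <- zip [0 ..] cs
    , let sub = [ ps | (p : ps) <- crit, p == i ]
    , not (null sub)
    ]
  where
    keep sub (CElem e) = CElem (sliceElement sub e)
    keep _   (CText s) = CText s

-- Naive serialisation back to XML text (no escaping of special characters).
render :: Element -> String
render (Elem n as cs) =
  "<" ++ n ++ concatMap attr as ++ ">" ++ concatMap content cs ++ "</" ++ n ++ ">"
  where
    attr (k, v)       = " " ++ k ++ "=\"" ++ v ++ "\""
    content (CElem e) = render e
    content (CText s) = s
```

For example, `render (sliceElement [[1, 0]] contact)` yields `<Contact><Name>Ryan</Name></Contact>`: only the path from Contact to the text inside Name survives.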

27 Implementation
From XML slices to Webpage slices
[Diagram: XML (Data) and XSLT (Presentation) are combined to produce the WebPage.]

28 Implementation
XSLT Implementation Guidelines
XSLT documents must generate the information and the presentation elements under the same conditions (i.e., the former is generated if and only if the latter is generated). Both the XML data and the presentation labels are generated together.
This does not impose any restriction on the power of XSLT, since the same webpages can be generated. On the contrary, this way of programming forces the programmer to build transformations that can be easily reused and maintained, because the information and the presentation data that depend on the same condition are put together.

29 Implementation XSLT Implementation Guidelines

30 Implementation
The implementation, some examples and other material are publicly available at:

31 Contents (current section: Conclusions & Future Work)

32 Conclusions
We proposed the application of program slicing techniques to XML data structures.
  We defined an algorithm to slice XML and DTD documents.
  The algorithm produces XML and DTD slices that are well-formed and valid.
  Previous slicers can be used with a modest implementation effort.
Slicing Web Pages
  The slicer can use XSLT in order to slice webpages.
  We proposed some guidelines to generate XSLT files.
Future Work
  Migration to XML Schema
  New implementation based on XQuery