Foundational Data Modeling and Schema Transformations for XML Data Engineering Stephen W. Liddle Information Systems Department Reema Al-Kamha & David.

Presentation transcript:

Foundational Data Modeling and Schema Transformations for XML Data Engineering Stephen W. Liddle Information Systems Department Reema Al-Kamha & David W. Embley Computer Science Department Brigham Young University, Provo, Utah

2 XML Data Engineering (24 April 2008, UNISCON 2008, Klagenfurt, Austria)
- Model XML conceptually
- Map conceptual models to XML
- Reverse-engineer XML to conceptual models
- Ensure properties
  - Information-preserving transformations
  - Constraint-preserving transformations
  - Redundancy-free guarantees

3 C-XML

4 Modeling XML Conceptually
- Scaling the mountain of abstraction
- Delicate balance
  - Enough modeling constructs
  - But not too many
- High-level capture of essentials
- Avoidance of low-level implementation details
- Formal but easily understood
- XML needs better abstractions

5 XML Schema/Model Mismatch
XML features not explicitly supported in traditional conceptual models:
- Ordered lists of concepts
- Choice of concept from among several
- Mixed content
- Use of content from another model
- Nested information hierarchies
C-XML

6 Missing Modeling Constructs (1)
- Sequence structure
  - Parent concept
  - Ordered child concepts
  - Constrained recurrence of children
  - Constrained recurrence of sequence itself
<xs:element name="MiddleName" type="xs:string" minOccurs="0" maxOccurs="2"/>
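The sequence structure can be made concrete with a short script. What follows is a minimal sketch, not taken from the presentation, that uses Python's standard `xml.etree.ElementTree` to build an `xs:sequence` in which both the children and the sequence itself may carry occurrence constraints. The helper names `occurrence_attrs`, `xs_element`, and `xs_sequence`, the parent name `StudentName`, and the child order are assumptions made for illustration; only the `MiddleName` declaration comes from the slide.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)  # serialize with the conventional xs: prefix


def occurrence_attrs(min_occurs, max_occurs):
    """Return minOccurs/maxOccurs attributes, omitting the schema default of 1."""
    attrs = {}
    if min_occurs != 1:
        attrs["minOccurs"] = str(min_occurs)
    if max_occurs != 1:
        attrs["maxOccurs"] = str(max_occurs)
    return attrs


def xs_element(name, type_name, min_occurs=1, max_occurs=1):
    """Hypothetical helper: one ordered child concept with its own recurrence."""
    attrs = {"name": name, "type": type_name}
    attrs.update(occurrence_attrs(min_occurs, max_occurs))
    return ET.Element(f"{{{XS}}}element", attrs)


def xs_sequence(parent_name, children, min_occurs=1, max_occurs=1):
    """Hypothetical helper: a parent concept whose content is an ordered sequence."""
    parent = ET.Element(f"{{{XS}}}element", {"name": parent_name})
    complex_type = ET.SubElement(parent, f"{{{XS}}}complexType")
    sequence = ET.SubElement(
        complex_type, f"{{{XS}}}sequence", occurrence_attrs(min_occurs, max_occurs)
    )
    sequence.extend(children)  # child order is significant in a sequence
    return parent


# Assumed example: a name with up to two optional middle names.
student_name = xs_sequence(
    "StudentName",
    [
        xs_element("FirstName", "xs:string"),
        xs_element("MiddleName", "xs:string", min_occurs=0, max_occurs=2),
        xs_element("LastName", "xs:string"),
    ],
)
print(ET.tostring(student_name, encoding="unicode"))
```

Passing `min_occurs` or `max_occurs` to `xs_sequence` constrains the recurrence of the sequence as a whole, which is the last bullet on the slide.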

7 Missing Modeling Constructs (1)

8 Missing Modeling Constructs (2)
- Choice structure
  - Parent concept
  - Choose one child concept from several alternatives
  - Constrained recurrence of chosen child
  - Constrained recurrence of choice itself
<xs:element name="PhoneNumber" type="xs:string" minOccurs="1" maxOccurs="2" />
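To illustrate what a choice with a recurrence constraint admits, here is a small sketch under a simplifying assumption: every alternative occurs exactly once each time the choice is taken, so the content is a list of picks whose length must fall within the bounds on the choice itself. The function name `satisfies_choice` and the alternative `EmailAddress` are hypothetical; `PhoneNumber` is the element shown on the slide.

```python
def satisfies_choice(children, alternatives, min_occurs=1, max_occurs=1):
    """Check child tag names against a repeated choice.

    Simplification: each pick contributes exactly one child element, so the
    number of children equals the number of times the choice was taken.
    """
    if any(tag not in alternatives for tag in children):
        return False  # a child that is not one of the alternatives
    upper = float("inf") if max_occurs == "unbounded" else max_occurs
    return min_occurs <= len(children) <= upper


# Hypothetical alternatives for a contact entry.
CONTACT = {"PhoneNumber", "EmailAddress"}

print(satisfies_choice(["PhoneNumber"], CONTACT))
print(satisfies_choice(["PhoneNumber", "EmailAddress"], CONTACT))
print(satisfies_choice(["PhoneNumber", "EmailAddress"], CONTACT, max_occurs=2))
```

With the default bounds of 1 the second call is rejected, because a single choice admits only one alternative; raising `max_occurs` to 2 lets the choice itself recur.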

9 Missing Modeling Constructs (3)
- Mixed attribute
  - Allows character and element data to be intertwined
- any and anyAttribute structures
  - Insert structures from other namespaces
  - Constrained recurrence
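Mixed content is easy to see in a parsed document. The sketch below, which is an illustration and not part of the presentation, walks an element with Python's standard `xml.etree.ElementTree` and lists character data and child elements in document order. The sample `Note` document and the function name `mixed_parts` are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical mixed-content instance: text interleaved with child elements.
note = ET.fromstring("<Note>Call <Name>Pat</Name> before <Day>Friday</Day>.</Note>")


def mixed_parts(element):
    """List text runs and child elements in the order they appear."""
    parts = []
    if element.text:
        parts.append(("text", element.text))  # text before the first child
    for child in element:
        parts.append(("element", child.tag))
        if child.tail:
            parts.append(("text", child.tail))  # text following this child
    return parts


for kind, value in mixed_parts(note):
    print(kind, repr(value))
```

ElementTree stores the text that follows a child in that child's `tail` attribute, which is why the walk reads `tail` rather than looking for separate text nodes.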

10 Missing Modeling Constructs (4)
- Nesting of hierarchical structures
  - Key organizational characteristic of XML
  - Arbitrarily complex nesting possible

11 C-XML Example

12 C-XML TO XML SCHEMA

13 [Figure panels: C-XML, XML Schema]
exists [0:*] (Course(x) Student(x1) Semester(x2) Grade(x3) )) -->

14 Algorithm Overview
- Generate a forest of scheme trees
- Translate an individual object set
- Translate scheme-tree collections of object sets
- Create a root node
- Add uniqueness constraints
- Translate generalization/specialization hierarchies

15 Generate Scheme Trees
(Student, StudentID, StudentName, FirstName, LastName, (MiddleName)*, (Course, Semester, Grade)*)*

16 Generate Scheme Trees
(Course, Department)*

17 Generate Scheme Trees
(GradStudent, Advisor)*
(UndergradStudent)*

18 Generate Scheme Trees
(Student, StudentID, StudentName, FirstName, LastName, (MiddleName)*, (Course, Semester, Grade)*)*
(Course, Department)*
(GradStudent, Advisor)*
(UndergradStudent)*

19 Generate Scheme Trees
[Figure node labels: Student, StudentID, StudentName, FirstName, LastName; MiddleName; Course, Semester, Grade; Course, Department; GradStudent, Advisor; UndergradStudent]
(Student, StudentID, StudentName, FirstName, LastName, (MiddleName)*, (Course, Semester, Grade)*)*
(Course, Department)*
(GradStudent, Advisor)*
(UndergradStudent)*
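The scheme-tree notation above maps naturally onto a small recursive data structure. The following sketch is an illustration only: the class name `SchemeNode` and the function `render` are assumptions, and the code reproduces the notation shown on the slides rather than the authors' tree-generation algorithm. Each node holds the object sets at that level plus its nested child nodes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchemeNode:
    """One node of a scheme tree: its object sets and its nested child nodes."""
    object_sets: List[str]
    children: List["SchemeNode"] = field(default_factory=list)


def render(node):
    """Render a node in the slides' notation, e.g. (A, B, (C)*)*."""
    parts = list(node.object_sets) + [render(child) for child in node.children]
    return "(" + ", ".join(parts) + ")*"


# The forest of four scheme trees listed on the slide.
forest = [
    SchemeNode(
        ["Student", "StudentID", "StudentName", "FirstName", "LastName"],
        [SchemeNode(["MiddleName"]), SchemeNode(["Course", "Semester", "Grade"])],
    ),
    SchemeNode(["Course", "Department"]),
    SchemeNode(["GradStudent", "Advisor"]),
    SchemeNode(["UndergradStudent"]),
]

for tree in forest:
    print(render(tree))
```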

20 Individual Object Sets...

21 Scheme-Tree Translation
[Figure labels: Students, Courses, GradStudents, UndergradStudents, MiddleNames, Course-Semester-Grades, Student, MiddleName, Course, GradStudent, UndergradStudent, Course-Semester-Grade]

22 Scheme-Tree Translation
...
<xs:element name="Semester-Course-Grade" minOccurs="0" maxOccurs="unbounded">
...
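As a rough illustration of how a nested scheme-tree node could become a repeating element such as the one above, the sketch below emits one `xs:element` per node with `minOccurs="0"` and `maxOccurs="unbounded"`. This is a simplified rule assumed for the example, not the translation algorithm from the presentation: the function name `translate`, the decision to give every object set an `xs:string` child element, and the reduced `Student` node are all hypothetical. The result is meant to sit inside a root element, since occurrence attributes are not allowed on global declarations.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def translate(name, object_sets, children=()):
    """Translate one scheme-tree node into a repeating xs:element (illustrative rule)."""
    element = ET.Element(
        f"{{{XS}}}element",
        {"name": name, "minOccurs": "0", "maxOccurs": "unbounded"},
    )
    complex_type = ET.SubElement(element, f"{{{XS}}}complexType")
    sequence = ET.SubElement(complex_type, f"{{{XS}}}sequence")
    for object_set in object_sets:
        # Assumption: each object set becomes a string-valued child element.
        ET.SubElement(
            sequence, f"{{{XS}}}element", {"name": object_set, "type": "xs:string"}
        )
    for child_name, child_sets, grandchildren in children:
        # Nested nodes recurse, giving the hierarchical structure of the tree.
        sequence.append(translate(child_name, child_sets, grandchildren))
    return element


student = translate(
    "Student",
    ["StudentID"],
    [("Semester-Course-Grade", ["Course", "Semester", "Grade"], [])],
)
print(ET.tostring(student, encoding="unicode"))
```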

23 Scheme-Tree Translation exists [0:*] (Course(x) Student(x1) Semester(x2) Grade(x3) )) -->

24

25 Root Element
[Figure labels: Students, Courses, GradStudents, UndergradStudents]
...

26 Uniqueness Constraints <xs:element name="Student" maxOccurs="unbounded">...
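One way to see what a uniqueness constraint guarantees is to check an instance document for repeated key values. The sketch below is an illustration, not the schema construct the slide elides: it assumes `StudentID` is a child element of each `Student`, and the function name `duplicate_keys` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_keys(scope, selector, field_name):
    """Return key values that occur more than once among the selected elements."""
    values = [item.findtext(field_name) for item in scope.findall(selector)]
    # Elements lacking the field are skipped here; a real key constraint
    # would also require the field to be present.
    counts = Counter(value for value in values if value is not None)
    return sorted(value for value, count in counts.items() if count > 1)


# Hypothetical instance document with one repeated StudentID.
students = ET.fromstring(
    "<Students>"
    "<Student><StudentID>1</StudentID></Student>"
    "<Student><StudentID>2</StudentID></Student>"
    "<Student><StudentID>1</StudentID></Student>"
    "</Students>"
)

print(duplicate_keys(students, "Student", "StudentID"))
```

An empty result means every selected `Student` has a distinct `StudentID` within the scope element.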

27 Generalization/Specialization

28 XML SCHEMA TO C-XML

29 [Figure panels: XML Schema, C-XML]

30 Algorithm Overview
- Generate object sets for each element & attribute
- Specify built-in and simple types in data frames
- Obtain relationship sets from parent-child connections
- Obtain participation constraints from minOccurs, maxOccurs, and use constraints
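The four steps above can be sketched as a single pass over a schema document. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `extract`, the sample schema, and the `min:max` tuple format are hypothetical, and the code handles only named, locally nested elements and attributes (no `ref`, named types, or groups).

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical input schema.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Student">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="FirstName" type="xs:string"/>
        <xs:element name="MiddleName" type="xs:string" minOccurs="0" maxOccurs="2"/>
      </xs:sequence>
      <xs:attribute name="StudentID" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def extract(schema_text):
    """Collect object sets, parent-child relationship sets, and declared types."""
    object_sets, relationships, types = [], [], {}

    def visit(node, parent_name):
        for child in node:
            name = child.get("name")
            if child.tag == XS + "element" and name:
                object_sets.append(name)
                types[name] = child.get("type")  # None for anonymous complex types
                if parent_name:
                    low = child.get("minOccurs", "1")
                    high = child.get("maxOccurs", "1")
                    high = "*" if high == "unbounded" else high
                    relationships.append((parent_name, name, f"{low}:{high}"))
                visit(child, name)
            elif child.tag == XS + "attribute" and name:
                object_sets.append(name)
                types[name] = child.get("type")
                if parent_name:
                    low = "1" if child.get("use") == "required" else "0"
                    relationships.append((parent_name, name, f"{low}:1"))
            else:
                visit(child, parent_name)  # descend through complexType, sequence, ...

    visit(ET.fromstring(schema_text), None)
    return object_sets, relationships, types


object_sets, relationships, types = extract(SCHEMA)
print(object_sets)
print(relationships)
```

Each relationship tuple records how many times the child may occur per parent, derived from `minOccurs`/`maxOccurs` for elements and from `use` for attributes.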

31 Attribute Transformation

32 Element Transformation

33 Choice Transformation

34 Sequence Transformation

35 Key Constraints Transformation

36 Substitution Group & Extension Transformation

37 Observation on Transformations These transformations to and from C-XML are not inverses of one another However, C-XML XML Schema C-XML XML Schema

38 Demo

39 PROPERTY GUARANTEES

40 Transformation Properties: C-XML to XML Schema
- Theorem 1: … preserves information. Proof: injective
- Theorem 2: Allowing for pragma constraints, … preserves constraints. Proof: by construction
- Theorem 3: … yields an XML-Schema instance whose complying XML documents are redundancy free. Proof: [TKDE, Aug06]

41 Transformation Properties: XML Schema to C-XML
- Theorem 4: … preserves information. Proof: injective
- Theorem 5: … preserves constraints. Proof: by construction

42 Conclusions
- C-XML models XML conceptually
- Transformations
  - C-XML to XML
  - Reverse-engineer XML to C-XML
- Properties
  - Information preserving
  - Constraint preserving
  - Redundancy-free guarantee