The ORA-SS Approach for Designing Semistructured Databases. Xiaoying Wu, Tok Wang Ling, Mong Li Lee (National University of Singapore); Gillian Dobbie (University of Auckland, New Zealand).


1 The ORA-SS Approach for Designing Semistructured Databases Xiaoying Wu, Tok Wang Ling, Mong Li Lee National University of Singapore Gillian Dobbie University of Auckland, New Zealand

2 Outline
1. Motivation
2. Introduction to the ORA-SS (Object-Relationship-Attribute) model
3. From ORA-SS to XML DTD
4. Normal form for ORA-SS schema diagram
5. Designing ORA-SS schema diagram into normal form
6. Comparison with related proposals
7. Summary

3 1. Motivation
• Example 1.1: Redundancy in an XML document
[XML listing; the markup did not survive. Remaining values: cs; 12, Smith, 230, Database; 22, Jones, 230, Database]
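To make the redundancy concrete, the following Python sketch rebuilds a small document from the values shown above and counts how often each course fact is stored. Only the values come from the slide; the element and attribute names (department, student, course, title, and so on) are assumptions made for illustration, since the original markup is not available.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical markup: only the values (cs, 12, Smith, 230, Database, ...)
# come from the slide; all element and attribute names are assumed.
DOC = """
<department name="cs">
  <student number="12">
    <name>Smith</name>
    <course code="230"><title>Database</title></course>
  </student>
  <student number="22">
    <name>Jones</name>
    <course code="230"><title>Database</title></course>
  </student>
</department>
"""

def repeated_course_facts(xml_text):
    """Return {(code, title): count} for course facts stored more than once."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        (course.get("code"), course.findtext("title"))
        for course in root.iter("course")
    )
    return {fact: n for fact, n in counts.items() if n > 1}

redundant = repeated_course_facts(DOC)
print(redundant)
```

Under these assumptions, the title of course 230 is stored once per enrolled student, so changing the title means updating every copy.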

4 1. Motivation (Cont.)
• Example 1.1 (Cont.)

5 1. Motivation (Cont.)
• Example 1.1 (Cont.)
Figure: Corresponding ORA-SS instance diagram and schema diagram

6 1. Motivation (Cont.)
• Example 1.1 (Cont.)
Figure: A better-designed ORA-SS schema diagram

7 1. Motivation (Cont.)
• Example 1.1 (Cont.)
Figure: A better-designed ORA-SS instance diagram

8 1. Motivation (Cont.)
• Example 1.2: Ambiguity in an OEM database and its DataGuide

9 1. Motivation (Cont.)
• Example 1.2 (Cont.): Ternary relationship type representation

10 1. Motivation (Cont.)
• Example 1.2 (Cont.): Binary relationship type representation
Note that the DataGuide for this schema diagram is the same as for the previous schema!

11 2. Introduction to ORA-SS Model
• Four concepts:
  – object classes
  – relationship types
  – attributes
  – references
• Four diagrams:
  – schema diagram
  – instance diagram
  – functional dependency diagram
  – inheritance diagram

12 2. Introduction to ORA-SS Model (Cont.)
• Object class
  – attributes of object class: single-valued, multi-valued
  – ordering on object class
Figure: Object class employee with attributes in an ORA-SS schema diagram

13 2. Introduction to ORA-SS Model (Cont.)
• Relationship type
  – attributes of relationship type: single-valued, multi-valued
  – degree of n-ary relationship type
  – participation constraints of objects in relationship type
  – disjunctive relationship type
  – recursive relationship type

14 2. Introduction to ORA-SS Model (Cont.)
• Relationship type (Cont.)
Figure: Representing a binary relationship type

15 2. Introduction to ORA-SS Model (Cont.)
• Relationship type (Cont.)
Figure: Representing a ternary relationship type

16 2. Introduction to ORA-SS Model (Cont.)
• Attributes
  – key attribute and identifier
  – composite attribute
  – disjunctive attribute
  – attribute with unknown structure (ANY)
  – ordering on attribute
  – attributes of object class / relationship type
  – single-valued / multi-valued attribute
  – fixed and default values of attribute
  – derived attribute

17 2. Introduction to ORA-SS Model (Cont.)
• Attributes (Cont.)
Figure: Object classes with relationship type and attributes in an ORA-SS schema diagram

18 2. Introduction to ORA-SS Model (Cont.)
• Attributes (Cont.)
Figure: Disjunctive attribute and relationship in an ORA-SS schema diagram

19 2. Introduction to ORA-SS Model (Cont.)
• References
Figure: Referencing an object class in an ORA-SS schema diagram

20 2. Introduction to ORA-SS Model (Cont.)
• References (Cont.)
Figure: Recursive relationship type in an ORA-SS schema diagram
Figure: Symmetric relationship sets in an ORA-SS schema diagram

21 3. Mapping ORA-SS schema diagram to XML DTD
• Algorithm 1: Mapping ORA-SS schema diagram to XML DTD
Input: an ORA-SS schema diagram SD
Output: an XML DTD
Begin
For each object class O in SD do:
Step 1. … sub-object classes of O.
Step 2. For each attribute A of O:
  Case (1) A is a single-valued simple attribute …
  Case (2) A is a single-valued composite attribute: replace A with its components and add them to …
  Case (3) A is a multi-valued simple attribute …
  Case (4) A is a multi-valued composite attribute: … A's components …

22 3. Mapping ORA-SS schema diagram to XML DTD (Cont.)
• Algorithm 1: Mapping ORA-SS schema diagram to XML DTD (Cont.)
Step 3. For each relationship attribute A under O:
  Case (1) A is a simple attribute: add A to O's subelementsList.
  Case (2) A is a multi-valued simple attribute: … add A to O's subelementsList.
  Case (3) A is a single-valued composite attribute: … A's components …
  Case (4) A is a multi-valued composite attribute: … A's components … add A to O's subelementsList.
Step 4. For each reference O-Ref:
  Case (1) O is a child object class of O1, and has no extra attributes and child object classes …
  Case (2) O is a root object class or it has nested attributes or child object classes …
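The flavour of the mapping can be sketched in Python for the simplest cases only. This is not the authors' Algorithm 1: it handles simple single-valued and multi-valued attributes and nested child object classes, and it omits composite attributes, relationship attributes and references. The `ObjectClass` type, the `to_dtd` function, and the choice to mark every child object class with `*` are assumptions made for illustration; the object class and attribute names are borrowed from the Dept/course example used elsewhere in the slides.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectClass:
    """Hypothetical, simplified stand-in for an ORA-SS object class."""
    name: str
    # (attribute name, is_multivalued) pairs; simple attributes only
    attributes: list = field(default_factory=list)
    # nested ObjectClass items
    children: list = field(default_factory=list)

def to_dtd(obj):
    """Emit DTD declarations for obj and, recursively, for its children."""
    # Attributes become subelements; multi-valued ones may repeat.
    subelements = [name + ("*" if multi else "") for name, multi in obj.attributes]
    # Assumption: every child object class may occur zero or more times.
    subelements += [child.name + "*" for child in obj.children]
    content = "(" + ", ".join(subelements) + ")" if subelements else "EMPTY"
    lines = [f"<!ELEMENT {obj.name} {content}>"]
    lines += [f"<!ELEMENT {name} (#PCDATA)>" for name, _ in obj.attributes]
    for child in obj.children:
        lines += to_dtd(child)
    return lines

course = ObjectClass("course", [("code", False), ("title", False)])
dept = ObjectClass("dept", [("dept-name", False)], [course])
dtd = to_dtd(dept)
print("\n".join(dtd))
```

A fuller version would derive `?`, `+` or `*` from the participation constraints in the schema diagram rather than always emitting `*`.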

23 3. Mapping ORA-SS schema diagram to XML DTD (Cont.)
• Example 3.1
Figure: Referencing an object class in an ORA-SS schema diagram

24 3. Mapping ORA-SS schema diagram to XML DTD (Cont.)
• Example 3.1 (Cont.)
Figure: An XML DTD for the ORA-SS schema diagram

25 4. Normal form for ORA-SS schema diagram
• Observation: ORA-SS is similar to nested relations
  – tree-like structure
  – repeating groups or multiple occurrences of objects
E.g., the corresponding nested relation for the ORA-SS schema diagram on this slide is:
Dept(dept-name, course(code, title, student(number, s-name, grade)*)*)
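The correspondence between a schema tree and its nested relation can be sketched in a few lines of Python. The tuple encoding `(name, attributes, children)` and the function name `nested_relation` are assumptions for illustration; each nested object class is written as a repeating group marked with `*`, as in the example above.

```python
def nested_relation(name, attributes, children=()):
    """Render an object class and its nested classes as a nested relation.

    children is a sequence of (name, attributes, children) triples; each
    nested class becomes a repeating group, marked with a trailing '*'.
    """
    parts = list(attributes)
    parts += [nested_relation(*child) + "*" for child in children]
    return f"{name}({', '.join(parts)})"

student = ("student", ["number", "s-name", "grade"], ())
course = ("course", ["code", "title"], (student,))
relation = nested_relation("Dept", ["dept-name"], (course,))
print(relation)
```

Run on the Dept example, the sketch reproduces the nested relation given on the slide.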

26 4. Normal form for ORA-SS schema diagram (Cont.)
• Objective: To ensure that the set of nested relations corresponding to the ORA-SS schema diagram is in the normal form for sets of nested relations (NF-NR) [5,6].
We will define:
  – Object class normal form (O-NF)
  – Relationship type normal form (R-NF)
  – ORA-SS normal form schema (ORA-SS NF)

27 4. Normal form for ORA-SS schema diagram (Cont.)
• Defn: object class normal form (O-NF)
An object class O of an ORA-SS schema diagram is said to be in object class normal form (O-NF) if the nested relation constructed with O's single-valued attributes as its atomic attributes and O's multi-valued attributes as its repeating groups is in normal form NF-NR.

28 4. Normal form for ORA-SS schema diagram (Cont.)
• Example 4.1: Assume we have the following functional dependencies for the ORA-SS schema diagram: {S# → dept, dept → faculty}
The corresponding nested relation for the schema diagram is Staff(S#, dept, faculty). It is not in 3NF, since faculty is transitively dependent on S#; hence the relation is not in NF-NR.
Figure: A better-designed ORA-SS schema diagram, in which the transitive functional dependency is removed.
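The 3NF argument in Example 4.1 can be checked mechanically with the standard attribute-closure algorithm. The Python sketch below is an illustration, not part of the original slides; the function names are made up, and the check assumes S# is the only candidate key of Staff.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def third_nf_violations(relation, fds, key):
    """Return the FDs whose left side is not a superkey and whose right side
    contains a non-prime attribute (assumes key is the only candidate key)."""
    violations = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) >= relation
        nonprime = (rhs - lhs) - key
        if not is_superkey and nonprime:
            violations.append((lhs, rhs))
    return violations

STAFF = {"S#", "dept", "faculty"}
FDS = [({"S#"}, {"dept"}), ({"dept"}, {"faculty"})]

staff_closure = closure({"S#"}, FDS)           # S# determines every attribute
violations = third_nf_violations(STAFF, FDS, key={"S#"})
print(violations)
```

The only reported violation is dept → faculty, which is the transitive dependency named in the example. One possible repair, assumed here because the improved diagram is not reproduced, is to split the data into (S#, dept) and (dept, faculty); neither part has a violation under the same check.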

29 4. Normal form for ORA-SS schema diagram (Cont.)
• Defn: relationship type normal form (R-NF)
A relationship type R of an ORA-SS schema diagram D is said to be in relationship type normal form (R-NF) if the nested relation constructed with the identifiers of the participating object classes and R's atomic attributes as its atomic attributes, and R's multi-valued attributes and composite attributes as its repeating groups, is in normal form NF-NR.

30 4. Normal form for ORA-SS schema diagram (Cont.)
• Example 4.2: The ORA-SS schema attempts to show that a lecturer can teach all the courses using all the textbooks described in the curriculum, i.e. it should satisfy the MVD constraint course-code →→ isbn | staff#.
The nested relation for the relationship type ctl is ctl(course-code, isbn, staff#). It is not in 4NF, so it is not in NF-NR; hence the relationship type ctl is not in R-NF.
Figure: A better design, in which the MVD is removed.
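To see why the MVD in Example 4.2 causes redundancy, the following Python sketch tests course-code →→ isbn | staff# on a small instance of ctl and then applies the textbook 4NF decomposition into two binary relations. The sample rows and the function names are invented for illustration, and the decomposition shown is the standard one rather than a reproduction of the improved diagram.

```python
from itertools import product

def mvd_holds(rows):
    """Check course-code ->> isbn | staff# on (course, isbn, staff) rows."""
    by_course = {}
    for course, isbn, staff in rows:
        by_course.setdefault(course, set()).add((isbn, staff))
    for pairs in by_course.values():
        isbns = {isbn for isbn, _ in pairs}
        lecturers = {staff for _, staff in pairs}
        # The MVD holds iff each course pairs every textbook with every lecturer.
        if pairs != set(product(isbns, lecturers)):
            return False
    return True

def decompose(rows):
    """Split ctl into (course, isbn) and (course, staff) projections."""
    course_text = {(course, isbn) for course, isbn, _ in rows}
    course_staff = {(course, staff) for course, _, staff in rows}
    return course_text, course_staff

# Hypothetical instance: one course, two textbooks, two lecturers.
CTL = [
    ("cs230", "isbn-A", "s1"),
    ("cs230", "isbn-A", "s2"),
    ("cs230", "isbn-B", "s1"),
    ("cs230", "isbn-B", "s2"),
]

holds = mvd_holds(CTL)
course_text, course_staff = decompose(CTL)
# Natural join of the two projections on course.
rejoined = {(c, i, s) for c, i in course_text for c2, s in course_staff if c == c2}
print(holds, len(CTL), len(course_text) + len(course_staff))
```

Because the MVD holds, the two projections rejoin to exactly the original rows, so the ternary relationship stores each textbook once per lecturer without adding information.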

31 4. Normal form for ORA-SS schema diagram (Cont.)
• Defn: ORA-SS normal form schema
An ORA-SS schema diagram D is in normal form (NF) iff it satisfies the following conditions:
1. Every object class in D is in O-NF.
2. For every relationship type R in D:
  (a) R is in R-NF.
  (b) Case (1): R is a binary relationship type from object class A to object class B. Then all of B's attributes can stay with B only if R is a one-to-many or one-to-one binary relationship type from A to B. All the attributes of R (if any) should be attached to B.
      Case (2): R is an n-ary relationship type with n (n > 2) participating object classes O1, O2, …, On, and the path going downward from the top of D linking those object classes is /O1/O2/…/On. Then for each object class Oi (2 ≤ i ≤ n):
        (i) Oi should have an i-ary relationship Ri with its ancestors O1, O2, …, Oi-1.
        (ii) The attributes of Oi can stay with Oi only if the functional dependency Oi → O1, O2, …, Oi-1 can be derived from the functional dependency diagram for D. The attributes of Ri (if any) should be attached to Oi.
3. There is no relationship type nested under another many-to-many or many-to-one binary or n-ary (n > 2) relationship type.
4. No relationship type can be derived from other relationship types in D.

32 4. Normal form for ORA-SS schema diagram (Cont.)
• Example 4.4: The ORA-SS schema diagram is not in NF if professor is also an employee in the department: the qualification of a professor can be derived from that of employee, so this information will be repeated in the underlying databases.
Figure: An ORA-SS schema diagram that is not in NF
Figure: An ORA-SS schema diagram that is in NF

33 5. Converting ORA-SS Schema Diagrams into Normal Form
Two approaches for designing semistructured databases:
• Approach 1.
  – Based on the users' requirements, come up with an initial ORA-SS schema diagram.
  – Normalize the ORA-SS schema diagram to its normal form.
  – Map it to an XML DTD or XML Schema.
• Approach 2.
  – Extract a schema from the instances using schema extraction techniques.
  – Translate the schema into an ORA-SS schema diagram. Here we need semantic enrichment, since not all the semantics needed are available from the extracted schema.
  – Convert the ORA-SS schema diagram into its normal form.
  – Translate the NF ORA-SS schema diagram back to an XML DTD or XML Schema.
  – Restructure the initial data instance to conform to the generated XML DTD or XML Schema.

34 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Algorithm 2: Converting an ORA-SS schema diagram into an NF ORA-SS schema diagram.
Input: an ORA-SS schema diagram SD, and its functional dependency diagram.
Output: an NF ORA-SS schema diagram.
{
Step 1. Convert any non-O-NF object class to O-NF.
Step 2. Bring each relationship type R into R-NF.
Step 3. This step involves two sub-steps.
  (1) Construct diagrams for each object class with its attributes.
  (2) Represent each relationship type R. We make R satisfy item (b) of condition 2 as well as condition 3 of the NF definition by introducing referencing object classes, and by requiring each relationship type to start with an object class with attributes (i.e., a non-reference object class).
Step 4. Remove those relationship types, along with their associated attributes, that can be derived from other relationship types in the schema diagram, to satisfy condition 4 of the NF definition.
}

35 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.1: There is a many-to-many binary relationship pc between professor and course, and a many-to-many binary relationship ct between course and textbook. The schema diagram is not in ORA-SS NF since it violates condition 3 of the NF definition.
Figure: (a) Initial ORA-SS schema diagram

36 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.1 (Cont.)
Step 1. The three given object classes are already in O-NF.
Step 2. The two relationship types pc and ct are already in R-NF.
Step 3. (1) Generate three diagrams for the object classes with attributes.
Figure: (b) Fragment diagrams for object classes

37 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.1 (Cont.)
Step 3 (Cont.). (2) Represent the binary relationship pc by creating a reference object class course1 referencing course, and nest course1 under professor.
Figure: (c) Diagrams after representing relationship pc

38 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.1 (Cont.)
Step 3 (Cont.). (2) Represent the binary relationship ct by creating a reference object class textbook1 referencing textbook, and nest textbook1 under course.
Step 4 (passed). The schema generated is in NF.
Figure: (d) Final ORA-SS schema diagram that is in NF
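Step 3 of Example 5.1 can be sketched in Python as follows. This is a simplified illustration restricted to binary relationships, not the authors' Algorithm 2; the function name `represent_binary` and the dictionary layout are assumptions, and reference object classes are named by appending "1" to the referenced class, as in the example.

```python
def represent_binary(object_classes, relationships):
    """Nest a reference object class under the parent of each binary relationship.

    object_classes: names of object classes that keep their own attributes;
        each becomes a top-level fragment (sub-step 1).
    relationships: (relationship, parent, child) triples (sub-step 2).
    Returns {object class: [(relationship, reference class, referenced class)]}.
    """
    schema = {name: [] for name in object_classes}
    for relationship, parent, child in relationships:
        reference = f"{child}1"  # e.g. course1 references course
        # The relationship starts at a non-reference object class, so it is
        # never nested under another relationship type.
        schema[parent].append((relationship, reference, child))
    return schema

schema = represent_binary(
    ["professor", "course", "textbook"],
    [("pc", "professor", "course"), ("ct", "course", "textbook")],
)
for name, nested in schema.items():
    print(name, nested)
```

In the result, ct hangs off the top-level course fragment rather than off course1 under professor, which is how the sketch avoids nesting one many-to-many relationship under another.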

39 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.2: There is a binary relationship cs between course and student, and a ternary relationship cst between course, student and tutor. Grade is an attribute of the binary relationship cs, and feedback is an attribute of the ternary relationship cst. The schema diagram is not in ORA-SS NF since it violates item (ii) of case (2) in condition 2(b) of the NF definition.
Figure: (a) Initial ORA-SS schema diagram

40 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.2 (Cont.)
Step 1. The three given object classes are already in O-NF.
Step 2. The two relationship types cs and cst are already in R-NF.
Step 3. (1) Generate three diagrams for the object classes with attributes.
Figure: (b) Fragment diagrams for object classes

41 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.2 (Cont.)
Step 3 (Cont.). (2) Represent the binary relationship cs: we create a reference object class student1 referencing student, and nest student1 under course. The relationship attribute grade is attached to student1.
Figure: (c) Diagram representing binary relationship cs

42 5. Converting ORA-SS Schema Diagrams into Normal Form (Cont.)
• Example 5.2 (Cont.)
Step 3 (Cont.). (2) Represent the relationship cst: we create a reference object class tutor1 referencing tutor, and nest tutor1 under student1. The relationship attribute feedback is attached to tutor1.
Step 4 (passed). The schema generated is in NF.
Figure: (d) Final ORA-SS schema diagram that is in NF
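The final structure of Example 5.2 can be written down as a small tree to show where each relationship attribute ends up. The `Ref` type and the `render` helper below are hypothetical and only describe the result of the example; they do not implement the conversion algorithm.

```python
from dataclasses import dataclass, field

@dataclass
class Ref:
    """Hypothetical reference object class in the normalized schema."""
    name: str        # reference object class, e.g. "student1"
    target: str      # object class it references
    rel_attrs: list  # relationship attributes attached to this reference
    nested: list = field(default_factory=list)  # reference classes nested below

def render(ref, depth=1):
    """Return indented text lines for ref and the reference classes under it."""
    attrs = ", ".join(ref.rel_attrs)
    lines = ["  " * depth + f"{ref.name} -> {ref.target} [{attrs}]"]
    for child in ref.nested:
        lines += render(child, depth + 1)
    return lines

# cs (course, student) carries grade; cst (course, student, tutor) carries feedback.
tutor1 = Ref("tutor1", "tutor", ["feedback"])
student1 = Ref("student1", "student", ["grade"], [tutor1])
lines = ["course"] + render(student1)
print("\n".join(lines))
```

The attributes of student and tutor themselves stay with the top-level student and tutor fragments; only grade and feedback travel with the reference object classes along the path course/student1/tutor1.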

43 6. Comparison with Related Proposal
• The first attempt to define a normal form for semistructured data [4]
  – Defines a schema called S3-Graph, a labeled graph in which vertices correspond to objects and edges represent the object-subobject relationship. Its data instance is called a semistructured data graph.
  – S3-Graph cannot show the degree of an n-ary relationship type, nor can it distinguish between attributes of object classes and attributes of relationship types.

44 6. Comparison with Related Proposal (Cont.)
• The first attempt to define a normal form for semistructured data [4] (Cont.)
  – Defines a dependency constraint, SS-dependency.
  – Proposes S3-NF. An S3-Graph is in S3-NF if there is no transitive SS-dependency. Hence, only this kind of redundancy can be recognized by S3-NF.

45 6. Comparison with Related Proposal (Cont.)
• The first attempt to define a normal form for semistructured data [4] (Cont.)
  – Presents two approaches to designing S3-NF databases:
    1. The decomposition method can remove identified transitive SS-dependencies and achieve S3-NF, but it may not be able to remove partial functional dependencies inside an entity type or object class, or the redundancy resulting from over-nesting.
    2. The transformation of a normal form ER diagram into an S3-Graph. The result may not be unique; it depends on the path constructed. Hence some results may not satisfy the application requirements or comply with the user's viewpoint.

46 6. Comparison with Related Proposal (Cont.)
• The most recent proposal: XNF (XML Normal Form) [2]
  – It mainly provides algorithms to translate a schema, represented in a conceptual model called a CM hypergraph, into a scheme-tree forest in XNF.
  – The CM hypergraph has no concept of attribute (so there are too many objects) and no hierarchical structure.
  – The given algorithms are non-deterministic and suffer from inefficiency.
  – Adding newly required information requires redesigning the schema.
  – The algorithms generate a large number of solutions rather than verifying whether a semistructured schema is in normal form.
  – ISA hierarchies are removed from the CM hypergraph before it is input to the algorithms.

47 6. Comparison with Related Proposal (Cont.)
• The advantages of our proposal:
  – Two-level design: incremental and iterative. First, identify object classes and relationship types from the user requirements. Then add attributes for the object classes and relationship types. In contrast, XNF requires all the needed information to be presented at once; even a small change in the information requirements requires redesigning the whole schema.

48 6. Comparison with Related Proposal (Cont.)
• The advantages of our proposal (Cont.):
  – Preserves the hierarchical structure that satisfies the users' requirements. In contrast, since the CM hypergraph has no hierarchy, XNF needs to generate many solutions. The XNF approach fails when the user already has a hierarchical structure and wants to preserve it and verify whether the design is good.

49 7. Summary
• The ORA-SS model helps to detect redundancy in semistructured data.
• We need a normal form for ORA-SS, since ORA-SS schema diagrams may contain redundancies and suffer from considerable updating anomalies.
• We define a normal form for ORA-SS schema diagrams. It ensures
  – no unnecessary redundancy and
  – no updating anomalies
  for semistructured databases generated from the schema.
• We present an algorithm for mapping an ORA-SS schema diagram into an XML DTD/Schema.

50 7. Summary (Cont.)
• We give a design methodology and present a comprehensive algorithm for normalizing an ORA-SS schema diagram into its normal form. The steps presented can also be used as guidelines for designing semistructured databases using the ORA-SS model.
  – As ORA-SS distinguishes objects from attributes, the design complexity is reduced.
  – ORA-SS allows two levels of design: first object classes and relationship types, then attributes.
• We show that the ORA-SS design approach outperforms other related proposals.

51 References
1. G. Dobbie, X.Y. Wu, T.W. Ling and M.L. Lee. ORA-SS: An Object-Relationship-Attribute Model for Semistructured Data. Technical Report TR21/00, School of Computing, National University of Singapore.
2. D.W. Embley and W.Y. Mok. Developing XML Documents with Guaranteed "Good" Properties. ER.
3. R. Goldman and J. Widom. DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases. Proceedings of the Twenty-Third International Conference on Very Large Data Bases, Athens, Greece, August.
4. S.Y. Lee, M.L. Lee, T.W. Ling and L.A. Kalinichenko. Designing Good Semi-structured Databases. ER 1999.
5. T.W. Ling. A Normal Form for Entity-Relationship Diagrams. Proc. 4th International Conference on Entity-Relationship Approach (1985).
6. T.W. Ling. A Normal Form for Sets of Not-Necessarily Normalized Relations. In Proceedings of the 22nd Hawaii International Conference on System Sciences. United States: IEEE Computer Society Press.
7. X.Y. Wu, T.W. Ling, M.L. Lee and G. Dobbie. Designing Semistructured Databases Using ORA-SS Model. In Proceedings of the 2nd International Conference on Web Information Systems Engineering (WISE), IEEE Computer Society, Kyoto, Japan, December 2001.