Data Structure: Data Modeling or XML? Metatopia 2007 Washington, D.C. November 6, 2007 David C. Hay Essential Strategies, Inc. 13 Hilshire Grove Lane,

Presentation transcript:

Data Structure: Data Modeling or XML? Metatopia 2007 Washington, D.C. November 6, 2007 David C. Hay Essential Strategies, Inc. 13 Hilshire Grove Lane, Houston, TX  (713)  

Copyright (c) 2007, Essential Strategies, Inc.

2 Agenda
- Case Study: The Justice Department
- Four ways to look at data
- The Conceptual Entity/Relationship Model
- The XML Version
- An Examination
- An Alternative Model
- The Revised XML
- The Federal Data Reference Model
- Conclusions

4 The heart of the problem...
“Accurate and germane sharing of information across jurisdictions is a critical issue for justice and public safety. Although there has been significant progress in the field of information technology, the lack of standards for exchanging justice data has not only been a major obstacle to, but also the principal reason for, the high costs involved with justice information exchange.” [1]
[1] U.S. Department of Justice. “Building Exchange Content Using the Global Justice XML Data Model: A User Guide for Practitioners and Developers”. June, p. v. (Available at:

5 According to the Justice Department...
- Sharing requires standards for exchanging data.
- For this reason, they have developed the Global Justice XML Data Model (in XML).
- But sharing involves more than data exchange.
- It means agreeing on semantics.
- This means understanding the meaning of the data being exchanged, not just the form and syntax.
This calls for something more...

7 Four ways to look at data
[Diagram: External Schemas 1, 2, and 3; one Conceptual Schema; two Internal Schemas, each made up of a Logical Schema (relational or XML) and a Physical Schema]

8 Four ways to look at data
[Diagram: External Schema 1, External Schema 2, External Schema 3]
External Schema
- In a particular viewer’s terms
- Overlapping
- May be inconsistent, at least in use of words
- Difficult to reconcile different views

9 Four ways to look at data
Conceptual Schema
- Reconciles the different external schemas into one view.
- Reflects the logical structure of the data.

10 Four ways to look at data
[Diagram: Internal Schema (relational) and Internal Schema (XML), each with a Logical Schema and a Physical Schema]
Internal Schema
- Logical Schema: design in terms of a DBMS (tables and columns, object classes, etc.)
- Physical Schema: design of the physical medium (cylinders, tablespaces, etc.)

11 In terms of the Architecture Framework...
- External Schemas 1, 2, and 3: business owners’ views
- Conceptual Schema: architect’s view
- Logical Schemas (relational, XML): designers’ views
- Physical Schemas: builders’ views

13 The Conceptual Model (Schema)...
[Diagram: the four-schema picture (External Schemas 1 to 3, Conceptual Schema, Logical Schemas for relational and XML, Physical Schemas), here focused on the Conceptual Schema]

14 The Conceptual Entity / Relationship Model...
- Addresses the semantics of the organization.
- Consists of assertions about the nature of the enterprise.
- Is graphic, so it can be discussed with the business community.

15 For example...

16 Note the semantics...
[Diagram: BOOK “primarily about” TOPIC; TOPIC “addressed in” BOOK]
- Each BOOK must be primarily about one and only one TOPIC.
- Each TOPIC may be addressed in one or more BOOKS.
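The two sentences above follow a fixed pattern: entity class, optionality ("must be" or "may be"), relationship name, cardinality, entity class. As a minimal sketch of that pattern, assuming a hypothetical `RelationshipEnd` class that is not part of the presentation, the assertions can be generated from a small description of each relationship end:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationshipEnd:
    """One direction of a relationship between two entity classes (illustrative only)."""
    subject: str        # entity class the sentence starts from
    mandatory: bool     # True -> "must be", False -> "may be"
    role: str           # relationship name, e.g. "primarily about"
    many: bool          # True -> "one or more", False -> "one and only one"
    target: str         # entity class at the other end (singular)
    target_plural: str  # plural form used when many is True

    def sentence(self) -> str:
        optionality = "must be" if self.mandatory else "may be"
        cardinality = "one or more" if self.many else "one and only one"
        noun = self.target_plural if self.many else self.target
        return f"Each {self.subject} {optionality} {self.role} {cardinality} {noun}."


book_to_topic = RelationshipEnd("BOOK", True, "primarily about", False, "TOPIC", "TOPICS")
topic_to_book = RelationshipEnd("TOPIC", False, "addressed in", True, "BOOK", "BOOKS")

print(book_to_topic.sentence())
print(topic_to_book.sentence())
```

Each relationship is read once in each direction, which is why the model needs a name at both ends of the line.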

17 OK, it’s true...
- Many data modelers don’t care about semantics.
- They only care about database design (the technical solution).
- Just as XML advocates are promoting a technical solution.
But semantics comes first!

19 What is XML?

20 XML is a kind of internal (Logical) schema...
[Diagram: the four-schema picture, with Logical Schema (XML) marked as a designers’ view]

21 XML: The 1 Minute tutorial...
- XML is a language for data communications.
- It is based on tags defined by the creator. For example: BlackBerry
- Notes:
  - Each tag provides a label for that which follows it.
  - Each tag must be accompanied by an end tag (</…)
  - Tags are defined by a community within which communications are to take place.
“All we have to do is to agree on the tags…”
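A small sketch of the tagging rules above, using Python's standard `xml.etree.ElementTree` parser. The tag names `order`, `product`, and `quantity` are assumptions made up for illustration; the transcript does not preserve the tags used on the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical message: the tag names are illustrative only.
document = "<order><product>BlackBerry</product><quantity>2</quantity></order>"

root = ET.fromstring(document)
for child in root:
    # Each tag labels the value that follows it.
    print(child.tag, "->", child.text)

# A missing end tag makes the document ill-formed, and the parser rejects it.
try:
    ET.fromstring("<order><product>BlackBerry</order>")
    well_formed = True
except ET.ParseError:
    well_formed = False
print("well formed:", well_formed)
```

The parser checks only that the tags are paired and nested correctly. Whether sender and receiver mean the same thing by `product` is outside its reach, which is the point the closing quotation makes.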

22 XML Schema: The 1 minute tutorial...
- An XML Schema is an XML document that defines the tags used to configure other documents.
- The “tags” are predefined in a W3C namespace: xs:schema xmlns:xs=“
- Some of the tags include:
- Key attributes of … and …: name=“Chuck”, type=“xs:string”, use=“required”, minOccurs=“0”, maxOccurs=“unbounded”
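To make those attributes concrete, here is a minimal sketch that reads a small schema with Python's standard library. The `book`, `title`, `author`, and `isbn` names are invented for illustration and do not come from the presentation's model. The code only parses the schema as an XML document; it does not validate instance documents against it.

```python
import xml.etree.ElementTree as ET

# The W3C XML Schema namespace that the xs: prefix is bound to.
XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: element and attribute names are illustrative only.
schema_text = f"""
<xs:schema xmlns:xs="{XS}">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="isbn" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

schema = ET.fromstring(schema_text)


def occurrence(element):
    # XML Schema defaults both minOccurs and maxOccurs to 1 when omitted.
    return element.get("minOccurs", "1"), element.get("maxOccurs", "1")


# Collect the occurrence limits of every element declared inside a sequence.
declared = {}
for sequence in schema.iter(f"{{{XS}}}sequence"):
    for element in sequence.findall(f"{{{XS}}}element"):
        declared[element.get("name")] = occurrence(element)

required_attributes = [
    attribute.get("name")
    for attribute in schema.iter(f"{{{XS}}}attribute")
    if attribute.get("use") == "required"
]

print(declared)
print(required_attributes)
```

In this sketch `title` has no occurrence attributes and so must appear exactly once, while `author` may be absent or repeated without limit.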

23 The XML Schema version of our model...
<xs:complexType><xs:sequence> <xs:complexType><xs:sequence> <xs:complexType><xs:sequence> <xs:complexType> </xs:complexType></xs:element> <xs:complexType><xs:sequence> <xs:complexType>

24 Of course one page isn’t enough...
</xs:simpleContent></xs:complexType></xs:element></xs:sequence></xs:complexType></xs:element></xs:sequence></xs:complexType></xs:element></xs:sequence></xs:complexType></xs:element></xs:sequence> </xs:complexType></xs:element></xs:sequence></xs:complexType></xs:element></xs:schema>

26 OK, let’s look at that again...
[The XML Schema listing from slide 23, with these callouts:]
- Entity classes
- Relationship cardinality (default: must be 1)
- Attributes
- Attribute optionality

27 And, of course, the rest of it...
[The remainder of the XML Schema listing from slide 24, with this callout:]
- You may not have noticed, but this is an attribute of WORKSHEET

28 XML Spy does provide a graphic...
[Diagram callouts:]
- Optional element
- Must be one or more elements

29 But wait! There’s more to the model!
In the data model, these were attributes of WORKBOOK

30 More about Styles...

31 Not to mention Worksheet Options, etc...

32 Please forgive me for not showing you the six pages of resulting XML...

33 What does this mean to the data model?

34 Including some very strange things...

35 The data model was inferred from the XML [2]
- Is it right?
[2] My thanks to Peter Aiken for this example.

36 Let’s look at the model again...
[Diagram callouts:]
- Is not an EXCEL workbook a workbook? (Sub-type?)
- What is this relationship?
- Which column for this cell?
- Attributes of authors, company?
- What is this? … or this?
- Attributes of WORKBOOK?
- More authors?

37 XML limitations...
- XML is fundamentally hierarchical...
  - Cannot have multiple parents
- Can describe a transaction, but...
  - Assumes validity of data
- Can impose cardinality rules
- Can impose syntactic rules
- No rules based on the meaning of the data.
- Cannot describe semantics of relationships
- Cannot be presented to normal human beings.
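The "cannot have multiple parents" point can be seen in a few lines. This is a sketch on invented data (two books sharing one author), not an example from the presentation: because every element sits under exactly one parent, a shared author has to be repeated under each book.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: two books that share one author.
authorship = [("Book A", "Author X"), ("Book A", "Author Y"), ("Book B", "Author X")]

library = ET.Element("library")
books = {}
for title, author in authorship:
    if title not in books:
        books[title] = ET.SubElement(library, "book", title=title)
    # Each <author> element can sit under only one <book>, so a shared
    # author has to be written again under every book.
    ET.SubElement(books[title], "author", name=author)

copies_of_x = library.findall(".//author[@name='Author X']")
print(len(copies_of_x))
print(ET.tostring(library, encoding="unicode"))
```

Nothing in the tree itself says that the two `Author X` elements describe the same person. A document can work around this with identifiers and references, but the network structure then lives in a convention layered on top of the hierarchy rather than in the hierarchy.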

38 In Fairness...
- While E/R modeling can more effectively portray the meaning of the data, it too is limited in its ability to portray business rules.
- New tools (such as XML Spy) are making it possible to deal with XML graphically.

40 With Data Modeling, we have an alternative...
[Diagram callouts:]
- Add network...
- ...and a sub-type?
- ...many-to-many...
- ...reference entity classes... (Border position, Number format, Font, Color, Vertical alignment, etc.)
- ...collapsed entities...

42 To convert a data model to XML...
- Identify a hierarchy. NOTE: Several may be available.
- Establish one direction for many-to-many relationships.
- Move intersect attributes to new “many” side.
- Inherit attributes from parents to children entity classes.
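The four steps above can be sketched on a tiny, made-up fragment of a model. Everything here is an assumption for illustration: the entity names, the `AUTHORSHIP` intersect entity and its `role` attribute, the `format` supertype attribute, and the choice of WORKBOOK as the root of the hierarchy. It is one possible reading of the steps, not the author's conversion.

```python
import xml.etree.ElementTree as ET

# Hypothetical model fragment: WORKBOOK and AUTHOR are many-to-many through
# an intersect entity AUTHORSHIP that carries its own attribute ("role").
workbooks = {"wb1": {"name": "Budget"}}
authors = {"a1": {"name": "Pat"}, "a2": {"name": "Lee"}}
authorships = [
    {"workbook": "wb1", "author": "a1", "role": "primary"},
    {"workbook": "wb1", "author": "a2", "role": "reviewer"},
]

# Attributes of the supertype that every sub-type (here, an Excel workbook) inherits.
supertype_attributes = {"format": "spreadsheet"}


def workbook_hierarchy():
    # Step 1: pick one hierarchy. Here WORKBOOK is the root; AUTHOR could
    # equally have been chosen, giving a different document.
    root = ET.Element("workbooks")
    for workbook_id, workbook in workbooks.items():
        # Step 4: copy the supertype attributes down onto the sub-type element.
        workbook_element = ET.SubElement(
            root, "excelWorkbook", {**supertype_attributes, **workbook}
        )
        for link in authorships:
            if link["workbook"] != workbook_id:
                continue
            # Steps 2 and 3: the many-to-many is read in one direction only
            # (workbook to author), and the intersect attribute "role" moves
            # onto the nested author element.
            ET.SubElement(
                workbook_element, "author", {**authors[link["author"]], "role": link["role"]}
            )
    return root


tree = workbook_hierarchy()
print(ET.tostring(tree, encoding="unicode"))
```

Choosing AUTHOR as the root instead would nest workbooks under authors and repeat each workbook's attributes, which is why the slide notes that several hierarchies may be available.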

43 Here’s one version...

44 And create a constrained model...

45 The Resulting XML (the graphic version, at least)...

46 More about the Worksheet...

47 Here’s a second version...
Only primary author is included

48 And create a constrained model...

49 The Resulting XML...

50 With Worksheet Details...

52 Federal Enterprise Architecture CIO Council, Office of Management and Budget. 3

53 Specifically, the Data Reference Model...
- Data Sharing: Query Points and Exchange Packages (XML)
- Data Context: Taxonomies (Categories) (Function/data Usage)
- Data Description: Data Elements (E/R Model)

54 But you must understand when you are not doing this but this.
[Diagram: the Data Reference Model with its three areas, Data Sharing (XML), Data Context (Taxonomies), and Data Description (E/R Model)]

56 Conclusions...
- XML is very good for data communications.
  - English syntax is convenient.
  - XML Schema is very powerful for describing the structure of transactions.
  - It is widely accepted.
  - Graphic tools make it more manageable.
- Semantic data modeling is better for analyzing data structure.
  - Graphic nature makes it suitable for discussing semantic issues.
  - Two-dimensional format makes it possible to describe networks.

57 More significantly...
- But XML is fundamentally a technological design...
...while conceptual data modeling is fundamentally a way to describe the business problem.
It’s important to understand the difference.

58 It’s better to start with the Data Model (More Semantics)
And then derive the XML script from that (Less Semantics)

59 Questions?