1
Chapter 1: Data Models and DBMS Architecture
Title: What Goes Around Comes Around
Authors: M. Stonebraker, J. Hellerstein
Pages: 2-40
2
What Goes Around Comes Around
Problem
–Problem statement
–Why is this problem important?
–Why is this problem hard?
Approaches
–Approach description, key concepts
–Contributions (novelty, improved)
–Weaknesses
3
Problem Statement – Data Model
Data Model: Wikipedia entry
Given
–A set of application domains
–Data representation needs, e.g. query, integrity, manipulation
Find
–A representation language
–A set of building blocks
Objectives
–Expressiveness
–Convenience, i.e. reduce the semantic gap (given, find)
Constraints
–Usability
–Performance
4
Why is this problem important?
A common data model yields benefits
–Informed decision making
  Strategies based on corporate-wide information
  Example: Customer relationship management
–Operational efficiencies
  Inter/intra-organization communication
  Example: Supply chain management
  Reduced cost of collaboration
–Scientific problem solving
  Example: Genome sequencing
Lack of a common data model
–Makes data sharing difficult
  Redundant and inconsistent data across applications
–Hampers informed decision making, collaboration, communication, …
5
Why is this problem hard?
Changes
–Set of applications evolves
  Business data processing (1960s) – COBOL
  Scientific apps, software development (1980s) – Objects
  Web (1990s) – XML
  Sensor networks (2000s)
  …
–Platforms evolve
  Computer hardware, languages, operating systems
  Storage: Tapes → Disks (1960s) → RAID (1990s) → SAN …
  CPUs: Mainframe → Mini → Desktops → Multi-core CPUs (2000s) …
6
Approaches
Nine waves
–IMS – Hierarchical Model
–CODASYL – Network Model
–Relational
–Entity Relationship
–Relational++
–Semantic Data Model
–OO
–Object Relational
–Semi-structured
For each approach
–Approach description, key concepts
–Contributions (novelty, improved)
–Weaknesses
7
Approaches
IMS – Hierarchical Model
–Constructs – record types, key, tree
–Concepts – physical data independence, logical data independence
–Limitations – many-to-many binary relationships => duplicates
CODASYL – Network Model
–Constructs – record types, keys, “set” type (edge), owner, child, network, entry
–Limitations – 3-way relationships, lack of physical data independence, bulk load
Relational
–Constructs – relations, relational algebra, functional dependency
–Limitations – transitive closure
Entity Relationship
–Constructs – entity, relationship, attribute
–Limitations – lack of a query language
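The "many-to-many => duplicates" limitation of the hierarchical model can be sketched in a few lines of Python. This is an illustrative sketch only: the supplier and part records, and the helper `count_part_copies`, are hypothetical and not taken from the slides.

```python
# Hypothetical supplier/part data, invented for illustration only.

# Hierarchical (tree) representation: each supplier record owns copies of the
# part records it supplies, so a part supplied by two suppliers is stored twice.
hierarchy = {
    "Supplier A": [{"pno": 1, "pname": "bolt"}, {"pno": 2, "pname": "nut"}],
    "Supplier B": [{"pno": 1, "pname": "bolt"}],
}


def count_part_copies(tree):
    """Count how many times each part number is physically stored in the tree."""
    copies = {}
    for parts in tree.values():
        for part in parts:
            copies[part["pno"]] = copies.get(part["pno"], 0) + 1
    return copies


# Relational representation: each part is stored once, and the many-to-many
# relationship lives in a separate relation of (supplier, pno) pairs.
parts = {(1, "bolt"), (2, "nut")}
supply = {("Supplier A", 1), ("Supplier A", 2), ("Supplier B", 1)}

part_copies = count_part_copies(hierarchy)
```

In the tree, part 1 is stored once under each supplier, so renaming it means updating every copy. In the relational form the name appears in exactly one tuple.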
8
Approaches
Relational++
–Constructs – set-valued attributes, aggregation (tuple reference), generalization
Semantic Data Model
–Constructs – class, class variable, multiple inheritance
OO
–Constructs – persistent programming language, no semantic gap, swizzle
–Limitations – weak support for transactions, queries
Object Relational
–Constructs – user-defined data types, operators, functions and access methods
Semi-structured
–Concepts – schema last, complex network-oriented data model
–Constructs – DTD, XMLSchema, union types, XPath
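The object-relational idea of user-defined functions running inside the DBMS can be sketched with Python's standard `sqlite3` module, which lets an application register a function that SQL queries can then call. This is only a small analogue of the idea, not how any particular object-relational system implements it. The table `site`, the function name `dist`, and the data are all hypothetical.

```python
import sqlite3


def distance_from_origin(x, y):
    """User-defined function: Euclidean distance of the point (x, y) from (0, 0)."""
    return (x * x + y * y) ** 0.5


conn = sqlite3.connect(":memory:")

# Register the Python function under the SQL name "dist" with two arguments.
conn.create_function("dist", 2, distance_from_origin)

# Hypothetical table and rows, invented for illustration.
conn.execute("CREATE TABLE site (name TEXT, x REAL, y REAL)")
conn.executemany(
    "INSERT INTO site VALUES (?, ?, ?)",
    [("near", 3.0, 4.0), ("far", 30.0, 40.0)],
)

# The user-defined function is evaluated by the query engine, row by row,
# as part of the WHERE clause.
nearby = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM site WHERE dist(x, y) < 10 ORDER BY name"
    )
]
```

The filter is expressed in the query itself, so the application does not need to fetch every row and apply the function on its own side.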
9
Lessons
IMS – Hierarchical Model
–1. Physical and logical data independence are desirable
–2. Tree-structured data models are very restrictive
–3. Tree-structured data => hard logical reorganization
–4. Record-at-a-time interface forces manual query optimization
CODASYL – Network Model
–5. Networks are more flexible and more complex than trees
–6. Loading and recovering networks is more complex than trees
Relational
–7. Set-at-a-time languages provide improved physical data independence
–8. Logical data independence is easier with a simpler data model
–9. Technical debates are usually settled by the marketplace
–10. Query optimizers can beat record-at-a-time programs
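Lessons 4, 7 and 10 contrast record-at-a-time and set-at-a-time interfaces. The difference can be sketched with Python's standard `sqlite3` module. The table `supply`, its rows, and both function names are hypothetical. The first function only imitates a navigational program by looping over records in application code; it is not an IMS or CODASYL interface.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, invented for illustration.
conn.execute("CREATE TABLE supply (sname TEXT, pno INTEGER, qty INTEGER)")
conn.executemany(
    "INSERT INTO supply VALUES (?, ?, ?)",
    [("A", 1, 100), ("A", 2, 50), ("B", 1, 300)],
)


def record_at_a_time(connection, pno):
    """The program fixes the access path: visit every record and filter by hand."""
    total = 0
    for _sname, row_pno, qty in connection.execute(
        "SELECT sname, pno, qty FROM supply"
    ):
        if row_pno == pno:
            total += qty
    return total


def set_at_a_time(connection, pno):
    """The program states what it wants; the DBMS chooses how to find it."""
    row = connection.execute(
        "SELECT COALESCE(SUM(qty), 0) FROM supply WHERE pno = ?", (pno,)
    ).fetchone()
    return row[0]
```

If an index on `pno` were added later, `set_at_a_time` could benefit without any change to its code, while `record_at_a_time` would still scan every record. That is the physical data independence lesson 7 refers to.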
10
Lessons
Entity Relationship
–11. Functional dependencies are difficult to understand
Relational++
–12. Without big performance or functionality advantages, new constructs will go nowhere
Semantic Data Model
OO
–13. Packages will not sell to users unless they are in “major pain”
–14. Persistent languages will not succeed without help from the programming language community
Object Relational
–15. Two good ideas: putting code in the DBMS, user-defined access methods
–16. Widespread adoption = f(standards, market forces)
Semi-structured
–17. Schema-last is probably a niche market
–18. XQuery = OR SQL with a different syntax
–19. Semantic heterogeneity >> XML