The Forest and the Trees
Julia Stoyanovich
Candidacy Exam in Database Systems, Fall 2005
October 31, 2005 Preparing to Study for the Exam
  reading list
    length: short; long > 50 pages
    topic: Query Processing, Views, XML, …
    interesting: yes, so-so
    age: same as I, after 1990
    by IBM Research or mentions System R: yes, no
The Enchanted Forest
  query processing, transaction systems, semi-structured, web, integration, OLAP, views
Goals
  database system: system design, data model, processing capabilities
  desired properties: general, extensible, elegant, efficient, expressive, simple
Trade-offs
  system or component: simple, efficient, general, elegant, extensible, expressive
  design: careful design, compromise, adaptability, theoretical foundations
Roadmap
  Introduction
  Theoretical Foundations
  Careful Design
  Compromise
  Adaptability
Materialized Views
  query rewriting
    domain: query optimization, data integration
    assumptions: closed-world, open-world
    properties: contained, equivalent, maximally-contained
    complexity
    algorithm: bucket, inverse rules, MiniCon, transformational, System R style
    output: query reformulation, execution plan
Roadmap: Introduction, Theoretical Foundations, Careful Design, Compromise, Adaptability
System R
  system design
    cost-based optimization
      access path selection: table scan, index scan
      plan enumeration: interesting orders, dynamic programming
      cost metric: CPU, I/O
      stats catalogue
    code generation
    locking: isolation levels, hierarchy of locks
    logging & recovery: in-place recovery, logical logging
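The plan enumeration items above (dynamic programming over join orders, driven by catalogue statistics) can be sketched roughly as follows. This is a toy illustration, not System R's actual optimizer: the tables R, S and T, their row counts, the selectivities, and the cost function (the sum of estimated intermediate result sizes) are all made-up assumptions, and interesting orders and access path choice are left out.

```python
from itertools import combinations

# Hypothetical catalogue statistics: table name -> row count (illustrative only).
CARD = {"R": 1000, "S": 100, "T": 10}
# Hypothetical join selectivities for table pairs that share a join predicate.
SEL = {frozenset(("R", "S")): 0.01, frozenset(("S", "T")): 0.1}


def join_cardinality(left_tables, left_rows, table):
    """Estimate the output rows when `table` joins an intermediate result."""
    rows = left_rows * CARD[table]
    for other in left_tables:
        # A missing entry means no join predicate, i.e. a cross product.
        rows *= SEL.get(frozenset((other, table)), 1.0)
    return rows


def best_left_deep_plan(tables):
    """Dynamic programming over table subsets, left-deep plans only.

    Toy cost (an assumption): the sum of estimated intermediate result sizes.
    Returns (cost, estimated rows, join order).
    """
    best = {frozenset((t,)): (0.0, float(CARD[t]), (t,)) for t in tables}
    for size in range(2, len(tables) + 1):
        for subset in map(frozenset, combinations(tables, size)):
            for table in subset:
                rest = subset - {table}
                cost, rows, order = best[rest]
                out_rows = join_cardinality(rest, rows, table)
                candidate = (cost + out_rows, out_rows, order + (table,))
                # Keep only the cheapest plan found for each subset.
                if subset not in best or candidate[0] < best[subset][0]:
                    best[subset] = candidate
    return best[frozenset(tables)]


cost, rows, order = best_left_deep_plan(["R", "S", "T"])
print(cost, rows, order)
```

With these made-up numbers the cheapest plan joins S and T first and brings in the large table R last. Keeping one best plan per subset is what keeps the search from enumerating every join order separately.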
Google
  system design
    ranking metric
      IR: term frequency, inverse document frequency
      PageRank: link structure
    distributed crawlers
    indexing: compression, inverted files
    infrastructure: custom file system, parallelism
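The link-structure part of the ranking metric above can be illustrated with a small power-iteration sketch of PageRank. The details here are assumptions chosen for the example rather than anything stated on the slide: the three-page graph, the damping factor of 0.85, the fixed iteration count, the normalization so that ranks sum to 1, and the way dangling pages are handled.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power iteration over a dict mapping each page to the pages it links to.

    Ranks are normalized to sum to 1 (one common formulation, assumed here).
    """
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # A page splits its damped rank evenly among its out-links.
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Dangling page: spread its rank over all pages (a convention).
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
        rank = new_rank
    return rank


# Hypothetical three-page web: a -> b, a -> c, b -> c, c -> a.
ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
print(sorted(ranks.items(), key=lambda item: -item[1]))
```

In this toy graph page c ends up ranked highest because both other pages link to it, and b lowest because it receives only half of a's rank.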
Roadmap: Introduction, Theoretical Foundations, Careful Design, Compromise, Adaptability
Common Trade-offs
  resource: space, time
  hardware: disk, memory, CPU
  techniques: indexes, views, inversion, compression
  algorithms: hashing, sorting, parallel algorithms
XML Storage Strategies
  clustering: document order, by tag, by real-world object
  storage: text file, RDBMS, object manager
  strategy
    text file: file
    RDBMS: DTD, edge, attribute
    object manager: light-weight objects
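As a rough illustration of the "edge" strategy for storing XML in an RDBMS, the sketch below shreds a document into rows of a single edge table. The column layout (source, ordinal, name, flag, target), the "@" prefix for attributes, and the sample document are assumptions made for this example, and mixed content is deliberately ignored.

```python
import itertools
import xml.etree.ElementTree as ET


def shred_to_edges(xml_text):
    """Return (source, ordinal, name, flag, target) rows for one document.

    flag is "val" when target holds a text value and "ref" when it holds
    the id of a child node. The root element gets id 0.
    """
    root = ET.fromstring(xml_text)
    ids = itertools.count(1)  # ids for non-leaf children, in document order
    edges = []

    def visit(element, node_id):
        ordinal = 0
        for name, value in element.attrib.items():
            ordinal += 1
            edges.append((node_id, ordinal, "@" + name, "val", value))
        for child in element:
            ordinal += 1
            if len(child) == 0 and not child.attrib:
                # Leaf element: store its text inline as a value.
                text = (child.text or "").strip()
                edges.append((node_id, ordinal, child.tag, "val", text))
            else:
                # Inner element: store a reference and recurse into it.
                child_id = next(ids)
                edges.append((node_id, ordinal, child.tag, "ref", child_id))
                visit(child, child_id)

    visit(root, 0)
    return edges


doc = '<book year="2005"><title>DB</title><author><name>Ann</name></author></book>'
for row in shred_to_edges(doc):
    print(row)
```

Each row could be loaded into one relational table, so a path query becomes a chain of self-joins on source and target. That is the price this layout pays for needing no schema knowledge.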
Roadmap: Introduction, Theoretical Foundations, Careful Design, Compromise, Adaptability
Garlic
  system architecture: wrapper, middleware
  wrapper: repository modeling, method invocation, query processing
    query processing: query planning, plan translation, query execution, execution plan, cost info
Memory-Conscious Algorithms
  calibrator
    measure: TLB, cache size, cost of a miss
    optimize: locality, CPU parallelism
Adaptive Query Optimization
  run-time activities
    statistics maintenance: adjust selectivities, discover correlations
    query re-optimization: re-plan, choose among concurrent plans
Conclusion
  database systems
    problems: large datasets, complex environments, trade-offs
    solutions: theoretical foundations, careful design, compromise, adaptability
Thank you!
Data Models
  data
    structure: structured, semi-structured, unstructured
    consistency: homogeneous, heterogeneous
    size
    location: central, distributed
Query Optimization
  tasks of a query optimizer
    access path selection: table scan, index lookup
    cost estimation: CPU, I/O, communication
    plan enumeration
    operator selection: sort-based, hash-based
    physical properties: order
    delegation of processing: to all nodes, to a subset of nodes