ORDB Implementation Discussion. Ramakrishnan and Gehrke, Database Management Systems, 3rd Edition.


From RDB to ORDB

Issues to address when adding OO extensions to a DBMS.

Layout of Data
- Large data types (ADTs/blobs): use a special-purpose file space for such data, with special access methods.
- Large fields in one tuple: a single tuple may not even fit on one disk page, so it must be broken into sub-tuples that are linked via disk pointers.
- Flexible layout: constructed types may contain variable-sized sets; for example, one attribute can be a set of strings.
  - Metadata describing the layout of fields within the tuple must be provided inside each type.
  - Insertion and deletion cause problems when a contiguous layout of tuples is assumed.
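The idea of breaking an oversized tuple into linked sub-tuples can be sketched as follows. This is only an illustration under assumptions: `PAGE_PAYLOAD`, `SubTuple`, `store_large_tuple`, and `load_large_tuple` are hypothetical names, a Python dict stands in for the disk, and integer page ids stand in for disk pointers.

```python
from dataclasses import dataclass
from typing import Dict, Optional

PAGE_PAYLOAD = 16  # hypothetical usable bytes per page (tiny, for illustration)


@dataclass
class SubTuple:
    payload: bytes
    next_page: Optional[int]  # stands in for a disk pointer; None ends the chain


def store_large_tuple(pages: Dict[int, SubTuple], data: bytes, first_page: int) -> int:
    """Split data into page-sized sub-tuples chained by page ids."""
    chunks = [data[i:i + PAGE_PAYLOAD] for i in range(0, len(data), PAGE_PAYLOAD)] or [b""]
    for k, chunk in enumerate(chunks):
        is_last = k == len(chunks) - 1
        pages[first_page + k] = SubTuple(chunk, None if is_last else first_page + k + 1)
    return first_page


def load_large_tuple(pages: Dict[int, SubTuple], page_id: int) -> bytes:
    """Follow the chain of pointers and reassemble the original tuple."""
    parts = []
    current: Optional[int] = page_id
    while current is not None:
        sub = pages[current]
        parts.append(sub.payload)
        current = sub.next_page
    return b"".join(parts)
```

Reading the tuple back costs one page access per sub-tuple, which is one reason a real system would try to place the chained pages close together.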

Layout of Data
More layout design choices (clustering on disk):
- Lay out a complex object nested and clustered on disk (if it is nested and not pointer based).
- Decide where to store objects that are referenced (shared) by several different structures.
- Many design options exist for objects that are in a type hierarchy with inheritance.
- Constructed types such as arrays require novel methods, such as chunking an array into (4x4) subarrays for non-contiguous access.
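Array chunking into 4x4 subarrays can be illustrated with a small sketch. The function names and the choice to identify a chunk by its (row block, column block) pair are assumptions made for this example, not something the slides prescribe.

```python
CHUNK = 4  # side length of each square chunk, as in the 4x4 example


def chunk_id(row: int, col: int) -> tuple:
    """Chunk that holds cell (row, col)."""
    return (row // CHUNK, col // CHUNK)


def chunks_for_region(r0: int, r1: int, c0: int, c1: int) -> set:
    """Chunks touched by the region rows r0..r1-1, columns c0..c1-1."""
    return {
        (cr, cc)
        for cr in range(r0 // CHUNK, (r1 - 1) // CHUNK + 1)
        for cc in range(c0 // CHUNK, (c1 - 1) // CHUNK + 1)
    }
```

A 4x4 region aligned with the chunk grid, such as `chunks_for_region(4, 8, 4, 8)`, touches a single chunk, whereas a row-major layout would spread the same region over four separate rows.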

Objects/OIDs
- OID generation: uniqueness across time and system.
- Object reference handling: must avoid dangling references.
- Semantics of object manipulation for shared objects.
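One possible way to meet these two requirements is sketched below. It assumes that an OID is a (node id, serial number) pair whose serial is never reused, and that deleting an object is refused while other objects still reference it. `OID`, `OIDGenerator`, and `ObjectStore` are hypothetical names, and other policies (for example, cascading deletes) are equally plausible.

```python
import itertools
from typing import NamedTuple


class OID(NamedTuple):
    node_id: int  # distinguishes systems
    serial: int   # never reused, so OIDs stay unique over time


class OIDGenerator:
    def __init__(self, node_id: int, start: int = 0):
        self._node_id = node_id
        self._counter = itertools.count(start)

    def new_oid(self) -> OID:
        return OID(self._node_id, next(self._counter))


class ObjectStore:
    def __init__(self, generator: OIDGenerator):
        self._generator = generator
        self._objects = {}  # OID -> set of OIDs that the object references

    def create(self, refs=()) -> OID:
        for ref in refs:
            if ref not in self._objects:
                raise KeyError(f"reference to unknown object {ref}")
        oid = self._generator.new_oid()
        self._objects[oid] = set(refs)
        return oid

    def delete(self, oid: OID) -> None:
        # Refuse the delete if it would leave a dangling reference behind.
        if any(oid in refs for other, refs in self._objects.items() if other != oid):
            raise ValueError(f"{oid} is still referenced")
        del self._objects[oid]
```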

ADTs
- Type representation: size/storage
- Type access: import/export
- Type manipulation: special methods that serve as filter predicates and join predicates
- Special-purpose index structures: efficiency

ADTs
Mechanism to add index support along with an ADT:
- External storage of the index file outside the DBMS
  - Provide an "access method interface" along the lines of Open(), close(), search(x), retrieve-next()
  - Plus statistics on the external index
- Or a generic "template" index structure
  - Generalized Search Tree (GiST): user-extensible
  - Concurrency/recovery provided
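The four calls of the access method interface can be sketched as an abstract class with one toy implementation. The Python method names mirror the slide, while `SortedListIndex` and its (key, rid) entries are assumptions made for illustration; a real external index would also expose statistics to the optimizer.

```python
import bisect
from abc import ABC, abstractmethod


class AccessMethod(ABC):
    """Interface that the DBMS calls; the index itself lives outside the DBMS."""

    @abstractmethod
    def open(self): ...

    @abstractmethod
    def close(self): ...

    @abstractmethod
    def search(self, key): ...

    @abstractmethod
    def retrieve_next(self): ...


class SortedListIndex(AccessMethod):
    """Toy index over (key, rid) pairs kept in sorted order."""

    def __init__(self, entries):
        self._entries = sorted(entries)
        self._is_open = False
        self._key = None
        self._pos = None

    def open(self):
        self._is_open = True

    def close(self):
        self._is_open = False
        self._pos = None

    def search(self, key):
        if not self._is_open:
            raise RuntimeError("index is not open")
        self._key = key
        # (key,) sorts before every (key, rid), so this finds the first match.
        self._pos = bisect.bisect_left(self._entries, (key,))

    def retrieve_next(self):
        """Return the next matching rid, or None when the scan is exhausted."""
        if self._pos is None or self._pos >= len(self._entries):
            return None
        key, rid = self._entries[self._pos]
        if key != self._key:
            return None
        self._pos += 1
        return rid
```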

Query Processing
- Query parsing:
  - Type checking for methods
  - Subtyping/overriding
- Query rewriting:
  - May translate path expressions into join operators
  - Deal with collection hierarchies (UNION?)
  - Indices or extraction out of collection hierarchy
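Translating a path expression into a join can be illustrated with a small sketch. The data, the path `e.dept.floor`, and the function names are all hypothetical; the point is only that following references object by object and joining on the reference produce the same answer, which is what allows the optimizer to pick a join algorithm.

```python
# Hypothetical data: each employee holds a reference (OID) to a department.
employees = [
    {"name": "Ann", "dept": 10},
    {"name": "Bob", "dept": 20},
    {"name": "Cy", "dept": 10},
]
departments = [
    {"oid": 10, "floor": 3},
    {"oid": 20, "floor": 5},
]


def by_navigation(employees, departments):
    """Evaluate e.dept.floor by following each reference one object at a time."""
    def fetch(oid):
        return next(d for d in departments if d["oid"] == oid)
    return [(e["name"], fetch(e["dept"])["floor"]) for e in employees]


def by_join(employees, departments):
    """Rewritten form: a hash join on employees.dept = departments.oid."""
    build = {d["oid"]: d for d in departments}
    return [
        (e["name"], build[e["dept"]]["floor"])
        for e in employees
        if e["dept"] in build
    ]
```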

Query Optimization Core
- New algebra operators must be designed, such as nest, unnest, array-ops, values/objects, etc.
- The query optimizer must integrate them into the optimization process:
  - New rewrite rules
  - New costing
  - New heuristics

Query Optimization Revisited
- Existing algebra operators revisited: SELECT
  - WHERE clause expressions can be expensive
  - So SELECT pushdown may be a bad heuristic

Selection Condition Rewriting
Example:
- (tuple.attribute < 50): only CPU time (on the fly)
- (tuple.location OVERLAPS lake-object): possibly complex, CPU-heavy computations; may involve both I/O and CPU costs
State of the art: consider the reduction factor only.
Now we must consider both factors:
- Cost factor: dramatic variations
- Reduction factor: unrelated to cost factor

Operator Ordering
[Figure: two operators, op1 and op2]

Ordering of SELECT Operators
- Cost factor: dramatic variations
- Reduction factor: orthogonal to cost factor
- We want maximal reduction and minimal cost
- Rank(operator) = (reduction) * (1/cost)
- Apply operators in order of decreasing rank, so that the highest-ranked operator runs first
  - High rank (good): low in cost, and large reduction
  - Low rank (bad): high in cost, and small reduction
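The rank formula can be tried out with a short sketch. It reads "reduction" as the fraction of tuples a predicate eliminates, so a high-rank predicate is the one to run first, and it assumes the predicates are independent. The predicate names and numbers are made up for illustration.

```python
def rank(reduction: float, cost: float) -> float:
    """Rank = reduction * (1 / cost); higher is better."""
    return reduction * (1.0 / cost)


def expected_cost(predicates) -> float:
    """Expected per-tuple cost of applying (name, reduction, cost) predicates in order."""
    total = 0.0
    surviving = 1.0  # fraction of tuples that reach the next predicate
    for _name, reduction, cost in predicates:
        total += surviving * cost
        surviving *= 1.0 - reduction
    return total


def order_by_rank(predicates):
    """Highest rank first: cheap, highly selective predicates run early."""
    return sorted(predicates, key=lambda p: rank(p[1], p[2]), reverse=True)


# Hypothetical predicates: (name, fraction of tuples eliminated, cost per tuple).
predicates = [
    ("overlaps_lake", 0.9, 100.0),
    ("attribute_lt_50", 0.5, 1.0),
    ("method_check", 0.2, 5.0),
]
```

With these numbers the expensive `overlaps_lake` predicate ends up last even though it removes the most tuples, which is why considering the reduction factor alone is not enough.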

Access Methods (on what?)
- Indexes that are ADT-specific
- Indexes on navigation paths
- Indexes on methods, not just on columns
- Indexes over collection hierarchies (trade-offs)
- Indexes for new WHERE clause expressions: not just comparisons such as =, but also "overlaps"

Registering New Index (to Optimizer)
- What WHERE conditions it supports
- Estimated cost for a "matching tuple"
  - Given by the index designer (user?)
  - Monitor statistics; even construct test plans
- Estimation of reduction factors/join factors:
  - Register an auxiliary function to estimate the factor
  - Provide simple defaults
- Estimation of method costs (~I/O/CPU)

Methods
- Dynamic linking of methods (outside DB)
- Overriding methods for type hierarchy
- Use of "methods" with implied semantics
- Incorporation of methods into query process: termination?
- "Untrusted" methods: methods may corrupt the server or modify DB content (side effects)
- Handling of "untrusted" methods: restrict the language; interpret vs. compile; run in an address space separate from the DB server

Query Optimization with Methods
- Estimation of "costs" of method predicates
- Optimization of method execution: similar in idea to handling correlated nested subqueries; the optimizer must recognize repetition and rewrite the physical plan, providing some level of precomputation and reuse.
- Optimization of method execution:
  1. If called on the same input, cache that one result.
  2. If called on a full column, presort the column first (group-by).
  3. Or precompute the results of the method for each possible value in the domain and put them in a hash table: val → fct(val). Look up the hash table during query processing, or even join with it, instead of recomputing fct(val).
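Options 1 and 3 above can be sketched in a few lines. `make_cached` and `precompute` are hypothetical helper names, and the sketch assumes the method is deterministic and free of side effects; otherwise reusing a stored result would be wrong.

```python
def make_cached(method):
    """Wrap a method so each distinct input value is evaluated only once."""
    cache = {}
    calls = {"count": 0}  # counts real evaluations, for illustration

    def cached(value):
        if value not in cache:
            calls["count"] += 1
            cache[value] = method(value)
        return cache[value]

    return cached, calls


def precompute(method, domain):
    """Build the val -> fct(val) hash table for every value in a small domain."""
    return {value: method(value) for value in domain}


def expensive(value):
    # Stand-in for a costly user-defined method.
    return value * value
```

Caching pays off when the column has many repeated values; precomputation only makes sense when the domain is small enough to enumerate.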

Query Processing
User-defined aggregate functions, e.g., "second largest" or "second yellowest"
Distributive aggregates: incremental computation
Provide:
- Initialize(): set up state space
- Iterate(): per tuple, update the state
- Terminate(): compute the final result based on the state, and clean up the state
For example, "second largest":
- Initialize(): 2 fields
- Iterate(): per tuple, compare numbers
- Terminate(): remove 2 fields
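The three-call protocol for "second largest" might look like the sketch below. It assumes "second largest" means the second-largest distinct value and that the result is None when fewer than two distinct values are seen; the class and driver names are hypothetical.

```python
class SecondLargest:
    def initialize(self):
        # State space: the two fields mentioned above.
        self.largest = None
        self.second = None

    def iterate(self, value):
        # Per tuple: compare the new value with the two stored ones.
        if self.largest is None or value > self.largest:
            self.second = self.largest
            self.largest = value
        elif value != self.largest and (self.second is None or value > self.second):
            self.second = value

    def terminate(self):
        # Compute the final result, then clean up the state.
        result = self.second
        self.largest = None
        self.second = None
        return result


def run_aggregate(aggregate, values):
    """Drive an aggregate the way a query executor would, one tuple at a time."""
    aggregate.initialize()
    for value in values:
        aggregate.iterate(value)
    return aggregate.terminate()
```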

Following Disk Pointers?
- Complex object structures with object pointers may exist (~ disk pointers)
- Navigate a complex object into memory for a long-running transaction, as in CAD design
- What to do about "pointers" between subobjects or related objects?
- Swizzle = replace OID references with in-memory pointers, and unswizzle back at the end
- Issues: an in-memory table of OIDs and their state; indicate the state in each object pointer via a bit
- Different policies for swizzling: on access, attached to object brought in, etc.
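A swizzle-on-access policy can be sketched as follows. This is an illustration under assumptions: `Obj` and `ObjectCache` are hypothetical names, a dict stands in for the disk, and an `isinstance` check plays the role of the "swizzled" bit in each pointer.

```python
class Obj:
    def __init__(self, oid, refs):
        self.oid = oid
        self.refs = refs  # each entry is an OID (unswizzled) or an Obj (swizzled)


class ObjectCache:
    def __init__(self, disk):
        self.disk = disk    # oid -> list of referenced oids
        self.resident = {}  # in-memory table: oid -> Obj

    def load(self, oid):
        if oid not in self.resident:
            self.resident[oid] = Obj(oid, list(self.disk[oid]))
        return self.resident[oid]

    def deref(self, obj, index):
        """Swizzle on access: replace the OID with a direct in-memory pointer."""
        ref = obj.refs[index]
        if not isinstance(ref, Obj):
            ref = self.load(ref)
            obj.refs[index] = ref
        return ref

    def unswizzle_all(self):
        """At the end, turn pointers back into OIDs and write objects out."""
        for obj in self.resident.values():
            obj.refs = [r.oid if isinstance(r, Obj) else r for r in obj.refs]
            self.disk[obj.oid] = list(obj.refs)
        self.resident.clear()
```

After the first `deref`, later accesses through the same pointer skip the OID table lookup entirely, which is the benefit swizzling is after.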

Models of Persistence
Different models of persistence for OODB implementations:
- Parallel type systems:
  - E.g., int and dbint
  - User must make the decision at object creation time
  - Allow for user control by "casting" types
- Persistence by container management:
  - Objects must be placed into "persistent containers" such as relations in order to stay around
  - E.g., Insert o into Collection MyBooks;
  - Could be rather dynamic control without casting
- Persistence by reachability:
  - Use global variable names to refer to objects and structures
  - Objects referenced by other objects that are reachable by the application are, by transitivity, also persistent
  - Needs garbage collection
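Persistence by reachability amounts to a graph traversal from the named roots, followed by garbage collection of everything else. The sketch below assumes objects are identified by strings and that `refs` maps each object to the objects it references; the names are hypothetical.

```python
def reachable(roots, refs):
    """All objects reachable from the named roots by following references."""
    seen = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(refs.get(oid, ()))
    return seen


def collect_garbage(roots, refs):
    """Keep only the persistent (reachable) objects; the rest is garbage."""
    live = reachable(roots, refs)
    return {oid: targets for oid, targets in refs.items() if oid in live}


# Hypothetical object graph: "MyBooks" is a global name, "tmp" is not rooted.
refs = {
    "MyBooks": ["b1", "b2"],
    "b1": ["author1"],
    "b2": [],
    "author1": [],
    "tmp": ["b2"],
}
```

Here `author1` persists only because `b1` references it, and `tmp` is collected even though it points at a persistent object.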

Summary
- A lot of work to get there: from physical database design/layout issues up to logical query optimizer extensions
- ORDB: reuses the existing implementation base and incrementally adds new features (but the relation remains a first-class citizen)