1 Distributed Databases Review CS347 June 6, 2001.

Presentation transcript:

1 Distributed Databases Review CS347 June 6, 2001

2 Fragmentation
How to partition a relation into pieces (fragments)
Types:
  – Primary horizontal
  – Derived horizontal
  – Vertical
  – Hybrid of the above possible
Desiderata:
  – Completeness: don’t lose tuples
  – Disjointness: no duplicate tuples
  – Reconstruction: make sure you can get back the original relation
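The three desiderata can be checked mechanically on a small example. The sketch below is illustrative only: the function names, the toy EMP relation, and the representation of tuples as Python tuples are assumptions, not part of the course material.

```python
def horizontal_fragments(relation, predicates):
    """Build one fragment per predicate (hypothetical helper)."""
    return [[t for t in relation if p(t)] for p in predicates]

def check_fragmentation(relation, fragments):
    """Return (complete, disjoint, reconstructs) for a horizontal fragmentation.

    Assumes the relation is a list of distinct, hashable tuples.
    """
    union = [t for frag in fragments for t in frag]
    complete = all(t in union for t in relation)      # no tuple is lost
    disjoint = len(union) == len(set(union))          # no tuple appears twice
    reconstructs = set(union) == set(relation)        # union gives back R
    return complete, disjoint, reconstructs

# Toy relation EMP(id, city); the data is made up for illustration.
EMP = [(1, "SF"), (2, "LA"), (3, "SF")]
by_city = horizontal_fragments(
    EMP, [lambda t: t[1] == "SF", lambda t: t[1] == "LA"])
overlapping = horizontal_fragments(
    EMP, [lambda t: t[0] <= 2, lambda t: t[0] >= 2])
```

Here `by_city` satisfies all three properties, while `overlapping` places tuple 2 in both fragments and so fails disjointness.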

3 Minterm-based horizontal frag.
Simple predicates Pr = {p_1, p_2, ..., p_m} and relation R
Generate “minterm” predicates from Pr
Eliminate and simplify (depends on app semantics)
Generate fragment σ_m(R) for each minterm “m”
Simple predicates:
  – Pr must be complete (do not under-fragment) and minimal (do not over-fragment)
  – Use predicates occurring in the most frequent queries
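A minimal sketch of minterm generation follows. Each minterm is a conjunction in which every simple predicate appears either plain or negated, so m predicates yield up to 2^m candidate fragments. The function name and the sample data are hypothetical, and dropping minterms that select no tuples is only a data-driven stand-in for the "eliminate and simplify" step, which in the slides depends on application semantics.

```python
from itertools import product

def minterm_fragments(relation, simple_predicates):
    """Map each non-empty minterm (a tuple of True/False signs) to its fragment."""
    fragments = {}
    for signs in product([True, False], repeat=len(simple_predicates)):
        # A tuple belongs to the minterm if every predicate evaluates
        # to the sign (plain or negated) chosen for it.
        frag = [t for t in relation
                if all(p(t) == s for p, s in zip(simple_predicates, signs))]
        if frag:  # assumption: empty minterms are simply dropped
            fragments[signs] = frag
    return fragments

# Toy relation EMP(name, salary, city); values are illustrative.
EMP = [("a", 40, "SF"), ("b", 60, "SF"), ("c", 70, "LA")]
preds = [lambda t: t[1] > 50, lambda t: t[2] == "SF"]
frags = minterm_fragments(EMP, preds)
```

Because every tuple satisfies exactly one minterm, the resulting fragments are complete and disjoint by construction.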

4 Derived Horizontal
R fragmented into {R_1, R_2, …, R_n}
For S, derive {S_1, S_2, …, S_n} where S_i = S ⋉ R_i
Useful for join queries between R and S
For completeness: referential integrity constraint from S to R
For disjointness: join attribute is key of R

Vertical
Split R by attributes
Repeat key attribute in each vertical fragment
Attribute affinities define grouping
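Derived horizontal fragmentation can be sketched as one semijoin per fragment of R. The relation names, attribute names, and data below are assumptions made for illustration; tuples are represented as dictionaries.

```python
def semijoin(s, r_fragment, s_attr, r_attr):
    """Return the tuples of S that join with at least one tuple of the R fragment."""
    keys = {t[r_attr] for t in r_fragment}
    return [t for t in s if t[s_attr] in keys]

def derive_fragments(s, r_fragments, s_attr, r_attr):
    """S_i = S semijoin R_i, one derived fragment per fragment of R."""
    return [semijoin(s, r_i, s_attr, r_attr) for r_i in r_fragments]

# DEPT is fragmented by location; EMP follows DEPT through dno.
DEPT_SF = [{"dno": 1, "loc": "SF"}]
DEPT_LA = [{"dno": 2, "loc": "LA"}, {"dno": 3, "loc": "LA"}]
EMP = [{"eno": 10, "dno": 1}, {"eno": 11, "dno": 2},
       {"eno": 12, "dno": 3}, {"eno": 13, "dno": 1}]

EMP_SF, EMP_LA = derive_fragments(EMP, [DEPT_SF, DEPT_LA], "dno", "dno")
```

In this example `dno` is a key of DEPT and every EMP tuple references an existing department, so the derived fragments are both disjoint and complete, matching the two conditions on the slide.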

5 Localization
Convert query tree on relations into query tree on fragments
Simplify (∪ up & σ, ⋈ down)
Rules:
  – [R: False] ⇒ Ø
  – σ_C1 [R: C2] ⇒ [R: C1 ∧ C2]
  – [R: C1] ⋈_A [S: C2] ⇒ [R ⋈_A S: C1 ∧ C2 ∧ R.A = S.A]
  – Given vertical fragments R_i = Π_Ai(R), for any B ⊆ A: Π_B(R) = Π_B [⋈ R_i | B ∩ A_i ≠ Ø]
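The selection and join rules can be illustrated with a deliberately restricted model in which every fragment condition is a half-open interval [lo, hi) on the join attribute A. This restriction, the function names, and the fragment layout are all assumptions of the sketch; real fragment conditions are arbitrary predicates.

```python
def intersect(c1, c2):
    """Conjunction of two interval conditions; None stands for False."""
    lo, hi = max(c1[0], c2[0]), min(c1[1], c2[1])
    return (lo, hi) if lo < hi else None

def localize_selection(fragments, selection):
    """Apply sigma_C1 [R: C2] => [R: C1 and C2]; drop fragments whose condition is False."""
    result = {}
    for name, cond in fragments.items():
        combined = intersect(selection, cond)
        if combined is not None:
            result[name] = combined
    return result

def localize_join(r_frags, s_frags):
    """Keep only fragment pairs whose conditions on A can both hold when R.A = S.A."""
    pairs = {}
    for r_name, r_cond in r_frags.items():
        for s_name, s_cond in s_frags.items():
            combined = intersect(r_cond, s_cond)
            if combined is not None:
                pairs[(r_name, s_name)] = combined
    return pairs

R = {"R1": (0, 10), "R2": (10, 20), "R3": (20, 30)}
S = {"S1": (0, 15), "S2": (15, 30)}
selected = localize_selection(R, (5, 12))   # query: 5 <= A < 12
join_pairs = localize_join(R, S)
```

The selection touches only R1 and R2, and the join needs four of the six possible fragment pairs, because pairs such as (R1, S2) have contradictory conditions and reduce to Ø.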

6 Distributed Operators
Sort
  – Basic sort (sort each individual fragment)
  – Range partitioning sort (partition by sort attribute + basic sort)
  – Parallel external sort merge (local sort + range partition by sort attribute)
  – Key issue: selecting partitioning vector
Join
  – Partitioned join (only for equi-joins)
  – Asymmetric fragment+replicate join (fragment R, replicate S)
  – General fragment+replicate join (fragment and replicate R and S, join all possible pairs)
  – Semi join programs (to reduce communication cost)
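Two of the operators above, range partitioning sort and partitioned join, can be simulated on a single machine. The sketch assumes each list stands for the data at one site, uses a hash function for the join partitioning (one possible choice; range partitioning would also work), and the function names are hypothetical.

```python
from bisect import bisect_right

def range_partition_sort(values, vector):
    """Partition by the sort attribute using a partitioning vector, then sort locally."""
    partitions = [[] for _ in range(len(vector) + 1)]
    for v in values:
        # bisect_right picks the partition whose range contains v
        partitions[bisect_right(vector, v)].append(v)
    return [sorted(p) for p in partitions]   # one "basic sort" per site

def partitioned_join(r, s, r_key, s_key, n_sites):
    """Equi-join: route tuples with equal keys to the same site, then join locally."""
    r_parts = [[] for _ in range(n_sites)]
    s_parts = [[] for _ in range(n_sites)]
    for t in r:
        r_parts[hash(t[r_key]) % n_sites].append(t)
    for t in s:
        s_parts[hash(t[s_key]) % n_sites].append(t)
    out = []
    for rp, sp in zip(r_parts, s_parts):
        out.extend((a, b) for a in rp for b in sp if a[r_key] == b[s_key])
    return out

sorted_parts = range_partition_sort([7, 3, 15, 10, 1, 22], [5, 12])
R = [(1, "a"), (2, "b"), (3, "c")]
S = [(2, "x"), (3, "y"), (3, "z"), (4, "w")]
joined = partitioned_join(R, S, 0, 0, n_sites=2)
```

Concatenating the sorted partitions in order gives the fully sorted result. A poorly chosen partitioning vector would leave some sites with most of the data, which is why the slide calls vector selection the key issue. The join routing works only because equal keys always land at the same site, hence the restriction to equi-joins.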

7 Query Optimization
Exhaustive + pruning
  – Enumerate all possible QEPs with given set of operators
  – Prune using heuristics (e.g., avoid Cartesian products)
  – Choose minimum cost QEP
Hill climbing
  – Initial feasible QEP + set of QEP transformations
  – Iterate until no more cost reduction:
      Transform current QEP in all possible ways
      Check cost of each transformed QEP
      Choose minimum and set as current QEP
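The hill-climbing loop can be sketched for the special case where a QEP is just a left-deep join order. The cost model (sum of estimated intermediate result sizes with a single uniform selectivity), the swap transformation, and the relation sizes are all assumptions chosen to keep the example small.

```python
def plan_cost(order, sizes, selectivity):
    """Toy cost: sum of estimated intermediate result sizes for a left-deep plan."""
    cost, current = 0.0, sizes[order[0]]
    for name in order[1:]:
        current = current * sizes[name] * selectivity
        cost += current
    return cost

def neighbors(order):
    """All plans reachable by swapping two relations in the join order."""
    for i in range(len(order)):
        for j in range(i + 1, len(order)):
            swapped = list(order)
            swapped[i], swapped[j] = swapped[j], swapped[i]
            yield tuple(swapped)

def hill_climb(initial, cost):
    """Move to the cheapest neighbor until no transformation reduces the cost."""
    current, current_cost = initial, cost(initial)
    while True:
        best = min(neighbors(current), key=cost)
        best_cost = cost(best)
        if best_cost >= current_cost:      # no more cost reduction
            return current, current_cost
        current, current_cost = best, best_cost

sizes = {"R": 1000, "S": 10, "T": 100}
cost_fn = lambda order: plan_cost(order, sizes, selectivity=0.01)
best_order, best_cost = hill_climb(("R", "T", "S"), cost_fn)
```

Starting from the order R, T, S, the search moves to an order that joins the small relations first. Hill climbing stops at a local minimum, so the result depends on the initial plan and is not guaranteed to match what exhaustive enumeration would find.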