Fragmentation. Univ.-Prof. Dr. Peter Brezany, Institut für Scientific Computing, Universität Wien.

Presentation transcript:

1 Fragmentation
Univ.-Prof. Dr. Peter Brezany, Institut für Scientific Computing, Universität Wien. Office hours: Tuesday.

2 Introduction
We already presented the various fragmentation strategies:
– horizontal
– vertical
– nesting fragments in a hybrid fashion

3 Horizontal Fragmentation
Primary horizontal fragmentation of a relation is performed using predicates that are defined on that relation. Derived horizontal fragmentation is the partitioning of a relation that results from predicates being defined on another relation.
Information requirements of horizontal fragmentation:
– Database information
– Application information

4 Database Example [figure]

5 Database Information
Database information concerns the global conceptual schema. In this context it is important to note how the database relations are connected to one another, especially through joins.
Example: for link $L_1$ of the database example figure, the owner and member functions have the following values: owner($L_1$) = PAY; member($L_1$) = EMP.
The quantitative information required about the database is the cardinality of each relation $R$, denoted card($R$).

6 Application Information
Two kinds of information are required:
– qualitative information, which guides the fragmentation activity
– quantitative information, which is incorporated primarily into the allocation models
The fundamental qualitative information consists of the predicates used in user queries. It is not possible to analyze all of the user applications to determine these predicates, so one should at least investigate the most "important" ones. A rule of thumb: the most active 20% of user queries account for 80% of the total data access.
Simple predicates: Given a relation $R(A_1, A_2, \ldots, A_n)$, where $A_i$ is an attribute defined over domain $D_i$, a simple predicate $p_j$ defined on $R$ has the form $p_j: A_i \;\theta\; Value$, where $\theta \in \{=, <, \neq, \leq, >, \geq\}$ and $Value \in D_i$. We use $Pr_i$ to denote the set of all simple predicates defined on a relation $R_i$. The members of $Pr_i$ are denoted by $p_{ij}$.
Example: for the relation instance PROJ, PNAME = "Maintenance" is a simple predicate, as is a comparison of BUDGET with a constant value.
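The form $A_i \;\theta\; Value$ can be sketched in a few lines of Python. This is only an illustration: the helper name `simple_predicate`, the sample tuple, and the budget threshold are assumptions and do not come from the slides.

```python
import operator

# The six comparison operators that theta may stand for.
THETA = {
    "=": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "!=": operator.ne,
}

def simple_predicate(attribute, theta, value):
    """Return a function that tests `attribute theta value` on one tuple (a dict)."""
    compare = THETA[theta]
    return lambda row: compare(row[attribute], value)

# Hypothetical PROJ tuple, assumed for illustration only.
row = {"PNO": "P1", "PNAME": "Maintenance", "BUDGET": 150000, "LOC": "Montreal"}

p1 = simple_predicate("PNAME", "=", "Maintenance")
p2 = simple_predicate("BUDGET", ">", 200000)  # the threshold is an assumed value

print(p1(row), p2(row))  # prints: True False
```

Representing a predicate as a function of a tuple keeps the later steps (negation, conjunction, selection) simple, since each of them only needs to call the predicate.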

7 Application Information (cont.)
User queries often include more complicated predicates, which are Boolean combinations of simple predicates. One important combination is the minterm predicate, a conjunction of simple predicates. Given a set $Pr_i = \{p_{i1}, p_{i2}, \ldots, p_{im}\}$ of simple predicates for relation $R_i$, the set of minterm predicates $M_i = \{m_{i1}, m_{i2}, \ldots, m_{iz}\}$ is defined as
$M_i = \{ m_{ij} \mid m_{ij} = \bigwedge_{p_{ik} \in Pr_i} p_{ik}^{*} \}, \quad 1 \leq k \leq m, \; 1 \leq j \leq z$
where $p_{ik}^{*} = p_{ik}$ or $p_{ik}^{*} = \neg p_{ik}$. So each simple predicate can occur in a minterm predicate either in its natural form or in its negated form.
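Because every simple predicate appears in a minterm either as is or negated, a set of n simple predicates yields 2^n candidate minterms. The following sketch enumerates them; the function names, the two predicates on a PAY(TITLE, SAL) relation, and the sample tuple are all assumptions made for illustration.

```python
from itertools import product

def minterms(names):
    """Return all minterms over the named simple predicates.

    A minterm is a tuple of (name, negated) pairs in which every simple
    predicate occurs exactly once, so n predicates give 2**n minterms.
    """
    return [tuple(zip(names, signs))
            for signs in product((False, True), repeat=len(names))]

def satisfies(row, minterm, predicates):
    """True if the row satisfies the conjunction described by the minterm."""
    return all(predicates[name](row) != negated for name, negated in minterm)

# Hypothetical simple predicates on a PAY(TITLE, SAL) relation.
predicates = {
    "TITLE = 'Programmer'": lambda row: row["TITLE"] == "Programmer",
    "SAL <= 30000": lambda row: row["SAL"] <= 30000,
}

all_minterms = minterms(list(predicates))
row = {"TITLE": "Programmer", "SAL": 24000}  # assumed sample tuple
matching = [m for m in all_minterms if satisfies(row, m, predicates)]

print(len(all_minterms), len(matching))  # prints: 4 1
```

Any tuple satisfies exactly one minterm, which is why minterms are a natural basis for defining disjoint fragments.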

8 Application Information (cont.)
Example: [figure]

9 Application Information (cont.)
In terms of quantitative information about the user applications, we need two sets of data:
1. Minterm selectivity: the number of tuples of the relation that would be accessed by a user query specified according to a given minterm predicate. E.g., in the previous example, sel($m_1$) = 0, since there are no tuples in PAY that satisfy the minterm predicate $m_1$, and sel($m_2$) = 1.
2. Access frequency: the frequency with which user applications access data. If $Q = \{q_1, q_2, \ldots, q_q\}$ is a set of user queries, acc($q_i$) indicates the access frequency of query $q_i$ in a given period. The minterm access frequencies can be determined from the query frequencies; acc($m_i$) denotes the access frequency of a minterm $m_i$.
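Minterm selectivity is a simple count, as the sketch below shows. The PAY rows and the two minterms are assumptions chosen for illustration, since the example table of the slides is not part of this transcript; they happen to give the same selectivities as the values quoted above.

```python
# Assumed sample rows for PAY(TITLE, SAL).
PAY = [
    {"TITLE": "Elect. Eng.", "SAL": 40000},
    {"TITLE": "Syst. Anal.", "SAL": 34000},
    {"TITLE": "Mech. Eng.", "SAL": 27000},
    {"TITLE": "Programmer", "SAL": 24000},
]

def sel(minterm, relation):
    """Minterm selectivity: number of tuples that satisfy the minterm."""
    return sum(1 for row in relation if minterm(row))

# Assumed minterms, each a conjunction of two simple predicates.
def m1(row):
    return row["TITLE"] == "Elect. Eng." and row["SAL"] <= 30000

def m2(row):
    return row["TITLE"] == "Elect. Eng." and row["SAL"] > 30000

print(sel(m1, PAY), sel(m2, PAY))  # prints: 0 1
```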

10 Primary Horizontal Fragmentation
Primary horizontal fragmentation is defined by a selection operation on the owner relations of a database schema. Given a relation $R$, its horizontal fragments are given by $R_i = \sigma_{F_i}(R)$, $1 \leq i \leq w$, where $F_i$ is the selection formula used to obtain fragment $R_i$.
Example: PROJ is decomposed into PROJ$_1$ and PROJ$_2$ by two selections on BUDGET: PROJ$_1$ = $\sigma_{F_1}$(PROJ) and PROJ$_2$ = $\sigma_{F_2}$(PROJ), where $F_1$ and $F_2$ are conditions on BUDGET.
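The definition $R_i = \sigma_{F_i}(R)$ amounts to filtering the relation once per selection formula. A minimal sketch follows; the PROJ rows and the budget threshold are hypothetical, and `fragment` is an illustrative helper name.

```python
def fragment(relation, formulas):
    """Apply each selection formula F_i to R and return the fragments R_i."""
    return [[row for row in relation if formula(row)] for formula in formulas]

# Hypothetical PROJ rows and threshold, chosen only for illustration.
PROJ = [
    {"PNO": "P1", "BUDGET": 150000},
    {"PNO": "P2", "BUDGET": 135000},
    {"PNO": "P3", "BUDGET": 250000},
    {"PNO": "P4", "BUDGET": 310000},
]
THRESHOLD = 200000

PROJ1, PROJ2 = fragment(PROJ, [
    lambda row: row["BUDGET"] <= THRESHOLD,
    lambda row: row["BUDGET"] > THRESHOLD,
])

# The two formulas are complementary, so the fragments are disjoint and
# together contain every tuple of PROJ.
covered = sorted(row["PNO"] for row in PROJ1 + PROJ2)
print([row["PNO"] for row in PROJ1], [row["PNO"] for row in PROJ2])
```

Choosing complementary formulas is what makes the union of the fragments reconstruct the original relation without duplicates.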

11 Primary Horizontal Fragmentation (cont.)
Example: [figure]

12 Primary Horizontal Fragmentation (cont.)
A more formal definition of a horizontal fragment: a horizontal fragment $R_j$ of relation $R$ consists of all the tuples of $R$ that satisfy a minterm predicate $m_j$. Hence, given a set of minterm predicates $M$, there are as many horizontal fragments of $R$ as there are minterm predicates. These fragments are called minterm fragments.
An important aspect of simple predicates is their completeness; another is their minimality. A set of simple predicates $Pr$ is said to be complete if and only if there is an equal probability of access by every application to any tuple belonging to any minterm fragment that is defined according to $Pr$.

13 Primary Horizontal Fragmentation (cont.)
Example: Consider the fragmentation of PROJ in the last example. If the only application that accesses PROJ wants to access the tuples according to the location, the set Pr = {LOC = "Montreal", LOC = "New York", LOC = "Paris"} is complete, since each tuple of each fragment PROJ$_i$ has the same probability of being accessed. If there is a second application that accesses only those project tuples where the budget is at most 20000, then Pr is not complete: some of the tuples within each PROJ$_i$ have a higher probability of being accessed because of this second application. To make the set of predicates complete, we need to add (BUDGET ≤ 20000, BUDGET > 20000) to Pr:
Pr = {LOC = "Montreal", LOC = "New York", LOC = "Paris", BUDGET ≤ 20000, BUDGET > 20000}

14 Primary Horizontal Fragmentation (cont.)
The second desirable property of the set of predicates, according to which minterm predicates and, in turn, fragments are to be defined, is minimality. If a predicate influences how fragmentation is performed (i.e., causes a fragment $f$ to be further fragmented into, say, $f_i$ and $f_j$), there should be at least one application that accesses $f_i$ and $f_j$ differently. In other words, the simple predicate should be relevant in determining a fragmentation. If all the predicates of a set $Pr$ are relevant, $Pr$ is minimal.

15 Primary Horizontal Fragmentation (cont.)
Example: The set Pr defined in the previous example is complete and minimal. If, however, we were to add the predicate PNAME = "Instrumentation" to Pr, the resulting set would not be minimal, since the new predicate is not relevant with respect to Pr: there is no application that would access the resulting fragments any differently.

16 Derived Horizontal Fragmentation
A derived horizontal fragmentation is defined on a member relation of a link according to a selection operation specified on its owner. Given a link $L$ where owner($L$) = $S$ and member($L$) = $R$, the derived horizontal fragments of $R$ are defined as $R_i = R \ltimes S_i$, $1 \leq i \leq w$, where $\ltimes$ is the semijoin, $w$ is the maximum number of fragments that will be defined on $R$, and $S_i = \sigma_{F_i}(S)$, where $F_i$ is the formula according to which the primary horizontal fragment $S_i$ is defined.

17 Derived Horizontal Fragmentation (cont.)
Example: Consider link $L_1$, where owner($L_1$) = PAY and member($L_1$) = EMP. Then we can group engineers into two groups according to their salary: those earning at most 30000 and those earning more than 30000. The two fragments EMP$_1$ and EMP$_2$ are defined as EMP$_1$ = EMP $\ltimes$ PAY$_1$ and EMP$_2$ = EMP $\ltimes$ PAY$_2$, where PAY$_1$ = $\sigma_{SAL \leq 30000}$(PAY) and PAY$_2$ = $\sigma_{SAL > 30000}$(PAY).
Figure: Derived horizontal fragmentation of EMP
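The derived fragmentation of EMP can be sketched as a semijoin of EMP with each primary fragment of PAY. The sample rows, the choice of TITLE as the join attribute of the link, and the helper name `semijoin` are assumptions made for illustration; only the salary boundary of 30000 is taken from the example.

```python
def semijoin(r, s, attribute):
    """Semijoin of R with S: tuples of R whose join value occurs in S."""
    keys = {row[attribute] for row in s}
    return [row for row in r if row[attribute] in keys]

# Assumed sample rows; TITLE is assumed to be the attribute linking PAY and EMP.
PAY = [
    {"TITLE": "Elect. Eng.", "SAL": 40000},
    {"TITLE": "Syst. Anal.", "SAL": 34000},
    {"TITLE": "Mech. Eng.", "SAL": 27000},
    {"TITLE": "Programmer", "SAL": 24000},
]
EMP = [
    {"ENO": "E1", "TITLE": "Elect. Eng."},
    {"ENO": "E2", "TITLE": "Syst. Anal."},
    {"ENO": "E3", "TITLE": "Mech. Eng."},
    {"ENO": "E4", "TITLE": "Programmer"},
]

# Primary horizontal fragments of the owner relation PAY.
PAY1 = [row for row in PAY if row["SAL"] <= 30000]
PAY2 = [row for row in PAY if row["SAL"] > 30000]

# Derived horizontal fragments of the member relation EMP.
EMP1 = semijoin(EMP, PAY1, "TITLE")
EMP2 = semijoin(EMP, PAY2, "TITLE")

print([row["ENO"] for row in EMP1], [row["ENO"] for row in EMP2])
```

The member relation is never filtered on its own attributes here: each EMP tuple simply follows the PAY fragment that holds its matching owner tuple.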