SECTION 21.5 Eilbroun Benjamin CS 257 – Dr. TY Lin INFORMATION INTEGRATION.


Presentation Outline
- 21.5 Optimizing Mediator Queries
  - Simplified Adornment Notation
  - Obtaining Answers for Subgoals
  - The Chain Algorithm
  - Incorporating Union Views at the Mediator

21.5 Optimizing Mediator Queries
- Chain algorithm: a greedy algorithm that finds a way to answer the query by sending a sequence of requests to its sources.
- It will always find a solution, assuming at least one solution exists.
- The solution it finds may not be optimal.

Simplified Adornment Notation
- A query at the mediator is limited to b (bound) and f (free) adornments.
- We use the following convention for describing adornments: name^adornments(attributes), where
  - name is the name of the relation, and
  - the number of adornments equals the number of attributes.
- For example, R^bf(1, a) is a subgoal on R whose first argument is bound and whose second argument is free.

Obtaining Answers for Subgoals
- Rules for subgoals and sources:
  - Suppose we have the subgoal R^(x1 x2 … xn)(a1, a2, …, an), and a source adornment for R is y1 y2 … yn.
  - The adornment on the subgoal matches the adornment at the source if, for every position i:
    - if yi is b or c[S], then xi = b;
    - if xi = f, then yi is not output restricted.
  - Note that if yi is f, u, or o[S], then xi may be either b or f.
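
The matching rule above can be sketched in a few lines of Python. This is only an illustration: the `SourceArg` class, its field names, and the `matches` function are hypothetical names introduced here, and the primed (output-restricted) flag is modeled as a boolean.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence


@dataclass(frozen=True)
class SourceArg:
    """One component of a source adornment (hypothetical representation)."""
    kind: str                          # 'b', 'f', 'u', 'c' for c[S], 'o' for o[S]
    choices: FrozenSet = frozenset()   # the set S for c[S] and o[S]
    primed: bool = False               # True means output restricted


def matches(subgoal: str, source: Sequence[SourceArg]) -> bool:
    """Return True if a subgoal adornment such as 'bf' matches the source adornment."""
    if len(subgoal) != len(source):
        return False
    for x, y in zip(subgoal, source):
        # Rule 1: a b or c[S] position at the source needs a bound argument.
        if y.kind in ("b", "c") and x != "b":
            return False
        # Rule 2: a free argument must be returned, so it cannot be output restricted.
        if x == "f" and y.primed:
            return False
    return True


# Source adornments taken from the example later in the slides.
R_SOURCE = [SourceArg("b"), SourceArg("f")]
S_SOURCE = [SourceArg("c", frozenset({2, 3, 5}), primed=True), SourceArg("f")]
T_SOURCE = [SourceArg("b"), SourceArg("u")]
```

Under this sketch, `matches("ff", S_SOURCE)` is false while `matches("bf", S_SOURCE)` is true, which is why S has to wait until its first argument becomes bound. Whether a bound value actually lies in the set S of a c[S] component is a run-time check and is not part of matching.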

The Chain Algorithm
- Maintains 2 types of information:
  - An adornment for each subgoal.
  - A relation X that is the join of the relations for all the subgoals that have been resolved.
- Initially, the adornment for a subgoal is b iff the mediator query provides a constant binding for the corresponding argument of that subgoal.
- Initially, X is a relation over no attributes, containing just an empty tuple.

The Chain Algorithm (con’t)
- First, initialize the adornments of the subgoals and X.
- Then, repeatedly select a subgoal that can be resolved. Let R^α(a1, a2, …, an) be the subgoal:
  1. Wherever α has a b, the corresponding argument of R is either a constant or a variable in the schema of X. Project X onto those of its variables that appear in R.

The Chain Algorithm (con’t)
  2. For each tuple t in the projection of X, issue a query to the source as follows (β is a source adornment):
    - If a component of β is b, then the corresponding component of α is b, and we can use the corresponding component of t for the source query.
    - If a component of β is c[S], and the corresponding component of t is in S, then the corresponding component of α is b, and we can use the corresponding component of t for the source query.
    - If a component of β is f, and the corresponding component of α is b, provide the constant value for this component in the source query.

The Chain Algorithm (con’t)
    - If a component of β is u, then provide no binding for this component in the source query.
    - If a component of β is o[S], and the corresponding component of α is f, then treat it as if it were f.
    - If a component of β is o[S], and the corresponding component of α is b, then treat it as if it were c[S].
  3. Every variable among a1, a2, …, an is now bound. For each remaining unresolved subgoal, change its adornment so that any position holding one of these variables is b.

The Chain Algorithm (con’t)
  4. Replace X with X ⋈ π_S(R), where S is the set of all variables among a1, a2, …, an.
  5. Project out of X all components that correspond to variables that appear neither in the head nor in any unresolved subgoal.
- If every subgoal is resolved, then X is the answer.
- If at some point no remaining subgoal can be resolved, then the algorithm fails.
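
The five steps can be put together in a short Python sketch. It is a simplified illustration under stated assumptions, not a definitive implementation: sources are in-memory lists of tuples, variables are strings and constants are non-strings, a source adornment is a list of hypothetical `(kind, choices, primed)` triples, and all function names (`chain`, `ask_source`, and so on) are invented here. An output-restricted source would not return primed columns; the sketch returns whole rows, which is harmless because those positions are always bound.

```python
def is_var(arg):
    """Variables are strings; anything else is a constant (modeling assumption)."""
    return isinstance(arg, str)


def adorn(args, bound):
    """Current adornment of a subgoal: b for constants and bound variables."""
    return ["b" if not is_var(a) or a in bound else "f" for a in args]


def matches(alpha, beta):
    """Matching rule: b and c[S] need a bound argument; f must not be primed."""
    for x, (kind, _choices, primed) in zip(alpha, beta):
        if kind in ("b", "c") and x != "b":
            return False
        if x == "f" and primed:
            return False
    return True


def ask_source(rows, beta, wanted):
    """Simulate one source request; None means the request cannot be sent."""
    sent = []
    for (kind, choices, _primed), v in zip(beta, wanted):
        if kind in ("c", "o") and v is not None and v not in choices:
            return None                    # value is not among the allowed choices
        sent.append(None if kind == "u" else v)   # u: provide no binding
    return [r for r in rows if all(s is None or s == c for s, c in zip(sent, r))]


def extend(t, args, row):
    """Join one tuple of X with one source row; None if they disagree."""
    out = dict(t)
    for a, v in zip(args, row):
        if is_var(a):
            if out.setdefault(a, v) != v:
                return None
        elif a != v:
            return None
    return out


def chain(head, subgoals, sources):
    """Return the set of head tuples, or None if the algorithm fails."""
    X = [{}]                               # no attributes, one empty tuple
    bound, unresolved = set(), list(subgoals)
    while unresolved:
        goal = next((g for g in unresolved
                     if matches(adorn(g[1], bound), sources[g[0]][0])), None)
        if goal is None:
            return None                    # no subgoal can be resolved
        unresolved.remove(goal)
        name, args = goal
        beta, rows = sources[name]
        joined = []
        for t in X:                        # steps 1, 2 and 4
            wanted = [t.get(a) if is_var(a) else a for a in args]
            for row in ask_source(rows, beta, wanted) or []:
                u = extend(t, args, row)
                if u is not None and u not in joined:
                    joined.append(u)
        bound |= {a for a in args if is_var(a)}            # step 3
        keep = set(head) | {a for _, g in unresolved for a in g if is_var(a)}
        X = []
        for t in joined:                   # step 5: project and drop duplicates
            p = {k: v for k, v in t.items() if k in keep}
            if p not in X:
                X.append(p)
    return {tuple(t[v] for v in head) for t in X}
```

The sketch always picks the first resolvable subgoal, which reflects the greedy character of the algorithm: it finds some order of source requests when one exists, without trying to find the cheapest one.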

The Chain Algorithm Example
- Mediator query:
  - Q: Answer(c) ← R^bf(1, a) AND S^ff(a, b) AND T^ff(b, c)
- Example:

  Relation             R      S             T
  Data (attributes)    w x    x y           y z
  Adornment            bf     c'[2,3,5]f    bu

The Chain Algorithm Example (con’t)
- Initially, the adornments on the subgoals are the same as in Q, and X contains an empty tuple.
- S and T cannot be resolved because they each have ff adornments, but their sources require the first argument to be either bound (b) or chosen from a set (c).
- R(1, a) can be resolved because its adornment is matched by the source’s adornment.
- Send R(w, x) with w = 1 to get the tuples of R shown in the table on the previous slide.

The Chain Algorithm Example (con’t)
- Project the subgoal’s relation onto its second component, since only the second component of R(1, a) is a variable.
- This is joined with X, resulting in X equaling this relation: X has the single attribute a, with tuples (2), (3), and (4).
- Change the adornment on S from ff to bf.

The Chain Algorithm Example (con’t)
- Now we resolve S^bf(a, b):
  - Project X onto a, resulting in X.
  - Now, search S for tuples whose first attribute equals a value of a in X, giving a relation over a and b.
  - Join this relation with X, and remove a because it appears neither in the head nor in any unresolved subgoal: X now has the single attribute b, with tuples (4) and (5).

The Chain Algorithm Example (con’t)
- Now we resolve T^bf(b, c), obtaining a relation over b and c:
  - Join this relation with X and project onto the c attribute to get the relation for the head.
  - The solution is {(6), (7), (8)}.
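
The example can be traced step by step in Python. The tuples of R, S, and T did not survive in this transcript, so the data below is hypothetical, chosen only to be consistent with the intermediate results stated on the slides (a in {2, 3, 4}, b in {4, 5}, and the final answer).

```python
# Hypothetical source contents (assumed; the original tables are not in the transcript).
R = {(1, 2), (1, 3), (1, 4)}      # attributes (w, x), source adornment bf
S = {(2, 4), (3, 5)}              # attributes (x, y), source adornment c'[2,3,5]f
T = {(4, 6), (5, 7), (5, 8)}      # attributes (y, z), source adornment bu
S_CHOICES = {2, 3, 5}             # the choice set of c'[2,3,5]

# Resolve R^bf(1, a): ask the source for R with w = 1, keep the second component.
X_a = {x for (w, x) in R if w == 1}

# Resolve S^bf(a, b): only values of a in the choice set can be sent to the source.
X_b = {y for a in X_a if a in S_CHOICES for (x, y) in S if x == a}

# Resolve T^bf(b, c): one request per value of b, then project onto c for the head.
answer = {(z,) for b in X_b for (y, z) in T if y == b}

print(sorted(X_a))     # [2, 3, 4]
print(sorted(X_b))     # [4, 5]
print(sorted(answer))  # [(6,), (7,), (8,)]
```

Note that a = 4 never reaches the source for S, because 4 is not in the choice set {2, 3, 5}.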

Incorporating Union Views at the Mediator
- This implementation of the Chain Algorithm does not consider that several sources can contribute tuples to a relation.
- If specific sources have tuples to contribute that other sources may not have, it adds complexity.
- To resolve this, we can consult all sources, or make best efforts to return all the answers.

Incorporating Union Views at the Mediator (con’t)
- Consulting All Sources
  - We can only resolve a subgoal when each source for its relation has an adornment matched by the current adornment of the subgoal.
  - Less practical, because it makes queries harder to answer, and impossible to answer if any source is down.
- Best Efforts
  - We need only 1 source with a matching adornment to resolve a subgoal.
  - We need to modify the chain algorithm to revisit each subgoal when that subgoal has new bound arguments.
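
The difference between the two policies comes down to "all sources match" versus "at least one source matches". A minimal sketch, assuming each source adornment is a list of hypothetical `(kind, primed)` pairs and using invented function names:

```python
def matches(alpha, beta):
    """Matching rule: b and c[S] need a bound argument; f must not be primed."""
    for x, (kind, primed) in zip(alpha, beta):
        if kind in ("b", "c") and x != "b":
            return False
        if x == "f" and primed:
            return False
    return True


def resolvable(alpha, source_adornments, consult_all):
    """Decide whether a subgoal with adornment alpha can be resolved now."""
    results = [matches(alpha, beta) for beta in source_adornments]
    if not results:
        return False                      # no source at all for this relation
    # Consulting all sources: every source must match.
    # Best efforts: one matching source is enough.
    return all(results) if consult_all else any(results)


# Two hypothetical sources for the same relation: one accepts ff, one needs bf.
SOURCES = [[("f", False), ("f", False)], [("b", False), ("f", False)]]
```

With these two sources, a subgoal adorned ff is resolvable under best efforts but not when consulting all sources. Once its first argument becomes bound, the second source matches as well, which is the reason a best-efforts version has to revisit the subgoal.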

Questions