CS240A: Databases and Knowledge Bases Carlo Zaniolo Department of Computer Science University of California, Los Angeles December, 2001 Notes From Textbook.

Relational Calculus and Datalog

CS240A: Databases and Knowledge Bases
Carlo Zaniolo, Department of Computer Science, University of California, Los Angeles. December, 2001.
Notes from the textbook: Advanced Database Systems by Zaniolo, Ceri, Faloutsos, Snodgrass, Subrahmanian and Zicari, Morgan Kaufmann, 1997.

A relational DB about students and courses
student
  Name         Major  Year
  'Joe Doe'    cs     senior
  'Jim Jones'  cs     junior
  'Jim Black'  ee     junior
took
  Name         Course  Grade
  'Joe Doe'    cs123   2.7
  'Jim Jones'  cs101   3.0
  'Jim Jones'  cs143   3.3
  'Jim Black'  cs143   3.3
  'Jim Black'  cs101   2.7
The same fact base for Datalog:
  student('Joe Doe', cs, senior).
  student('Jim Jones', cs, junior).
  student('Jim Black', ee, junior).
  took('Joe Doe', cs123, 2.7).
  took('Jim Jones', cs101, 3.0).
  took('Jim Jones', cs143, 3.3).
  took('Jim Black', cs143, 3.3).
  took('Jim Black', cs101, 2.7).

Rules
- How to express logical conjunction: find the names of junior-level students who have taken both cs101 and cs143.
  firstreq(Name) ← student(Name, Major, junior), took(Name, cs101, Grade1), took(Name, cs143, Grade2).
- The part to the left of ← is the rule head; the part to the right is the rule body.
- Names beginning with an upper-case letter are variables, names beginning with a lower-case letter are constants, and _ denotes an anonymous variable.
- The commas in the body represent logical conjunction.
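To make the reading of the rule concrete, here is a minimal Python sketch that evaluates firstreq over the fact base from the slides. Modelling facts as Python tuples in sets and the helper name name_course are assumptions made for illustration; they are not part of Datalog.

```python
# Fact base copied from the slides; each Python tuple stands in for one Datalog fact.
student = {
    ("Joe Doe", "cs", "senior"),
    ("Jim Jones", "cs", "junior"),
    ("Jim Black", "ee", "junior"),
}
took = {
    ("Joe Doe", "cs123", 2.7),
    ("Jim Jones", "cs101", 3.0),
    ("Jim Jones", "cs143", 3.3),
    ("Jim Black", "cs143", 3.3),
    ("Jim Black", "cs101", 2.7),
}


def firstreq(student, took):
    """firstreq(Name) <- student(Name, _, junior), took(Name, cs101, _), took(Name, cs143, _)."""
    # The grades are not used by the rule, so keep only (Name, Course) pairs.
    name_course = {(name, course) for (name, course, _grade) in took}
    return {
        name
        for (name, _major, year) in student
        if year == "junior"                    # constant in the student goal
        and (name, "cs101") in name_course     # the shared variable Name links the goals
        and (name, "cs143") in name_course
    }


answers = firstreq(student, took)
print(sorted(answers))
```

Each comma in the rule body becomes an `and` in the comprehension, and reusing the single Python variable `name` plays the role of the shared Datalog variable Name.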

Disjunction
- Junior-level students who took course cs131 or course cs151 with a grade better than 3.0. Disjunction is expressed by several rules with the same head:
  scndreq(Name) ← took(Name, cs131, Grade), Grade > 3.0, student(Name, Major, junior).
  scndreq(Name) ← took(Name, cs151, Grade), Grade > 3.0, student(Name, _, junior).
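The two scndreq rules can be sketched in Python as the union of two rule evaluations. The took facts for cs131 and cs151 below are hypothetical, since the fact base on the slides contains no such courses; the function name scndreq_rule is also an assumption.

```python
student = {
    ("Joe Doe", "cs", "senior"),
    ("Jim Jones", "cs", "junior"),
    ("Jim Black", "ee", "junior"),
}
# Hypothetical facts, added only so that the rules have something to match.
took = {
    ("Jim Jones", "cs131", 3.7),
    ("Jim Black", "cs151", 2.9),
    ("Joe Doe", "cs151", 3.5),
}


def scndreq_rule(course):
    """One rule: scndreq(Name) <- took(Name, course, Grade), Grade > 3.0, student(Name, _, junior)."""
    return {
        name
        for (name, c, grade) in took
        for (sname, _major, year) in student
        if c == course and grade > 3.0 and sname == name and year == "junior"
    }


# Two rules with the same head: the answer is the union of what each rule derives.
scndreq = scndreq_rule("cs131") | scndreq_rule("cs151")
print(scndreq)
```

Under these assumed facts only Jim Jones qualifies: Jim Black's grade is not above 3.0 and Joe Doe is a senior.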

Queries
- A closed query: the answer to such a query is either yes or no. For instance: ?firstreq('Jim Black')
- An open query: ?firstreq(X), and its answer:
  firstreq('Jim Jones')
  firstreq('Jim Black')
- Much power and convenience in cascading! E.g., both requirements must be satisfied to enroll in cs298:
  reqcs298(Name) ← firstreq(Name), scndreq(Name).

The Relational Model versus Datalog
Terminology:
  Relational Model     Datalog
  Table or Relation    Base Predicate
  Row or Tuple         Fact
  Column               Argument
  View                 Derived Predicate

Negation in Datalog
- Only goals can be negated.
- Negated heads are not allowed!
- Junior-level students who did not take course cs143:
  hastaken(Name, Course) ← took(Name, Course, Grade).
  lacks_cs143(Name) ← student(Name, _, junior), ¬hastaken(Name, cs143).
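A small Python sketch of the negated goal, under the assumption that facts are modelled as tuples in sets. The student 'Ann Lee' and her took fact are hypothetical additions; with the slide data alone both juniors have taken cs143 and the answer would be empty.

```python
student = {
    ("Joe Doe", "cs", "senior"),
    ("Jim Jones", "cs", "junior"),
    ("Jim Black", "ee", "junior"),
    ("Ann Lee", "cs", "junior"),      # hypothetical student
}
took = {
    ("Joe Doe", "cs123", 2.7),
    ("Jim Jones", "cs101", 3.0),
    ("Jim Jones", "cs143", 3.3),
    ("Jim Black", "cs143", 3.3),
    ("Jim Black", "cs101", 2.7),
    ("Ann Lee", "cs101", 3.5),        # hypothetical fact
}

# hastaken(Name, Course) <- took(Name, Course, Grade).
hastaken = {(name, course) for (name, course, _grade) in took}

# lacks_cs143(Name) <- student(Name, _, junior), not hastaken(Name, cs143).
# The positive goal student(...) supplies the candidate names; the negated goal
# only filters them, so it is tested after Name is bound.
lacks_cs143 = {
    name
    for (name, _major, year) in student
    if year == "junior" and (name, "cs143") not in hastaken
}
print(lacks_cs143)
```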

Universal Quantification by Double Negation
- Find the senior students who completed all the requirements for the cs major: ?all_req_sat(X)
- The first step is to formulate the complementary query: find the students who did not take some of the courses required for a cs major.
- We can now re-express the original query as: find the senior students who are NOT missing any requirement.
  req_missing(Name) ← student(Name, _, senior), req(cs, Course), ¬hastaken(Name, Course).
  all_req_sat(Name) ← student(Name, _, senior), ¬req_missing(Name).

Domain Relational Calculus
- Relational calculus comes in two main flavors:
  1. in the Domain Relational Calculus (DRC), variables denote values of attributes;
  2. in the Tuple Relational Calculus (TRC), variables denote whole tuples.
- In DRC, the query "Find the names of junior-level students who have taken both cs101 and cs143" is:
  { (N) | ∃G1 (took(N, cs101, G1)) ∧ ∃G2 (took(N, cs143, G2)) ∧ ∃M (student(N, M, junior)) }

Domain Relational Calculus (cont.)
- The query ?scndreq(N) can be expressed as follows:
  { (N) | ∃G, ∃M (took(N, cs131, G) ∧ G > 3.0 ∧ student(N, M, junior)) ∨ ∃G, ∃M (took(N, cs151, G) ∧ G > 3.0 ∧ student(N, M, junior)) }
- DRC presents several syntactic differences w.r.t. Datalog:
  - set definition by abstraction (rather than rules),
  - conjunctions and disjunctions in the same formula,
  - nesting of parentheses, and
  - explicit quantifiers.

Explicit Quantifiers
- Existential and universal quantification are both allowed in DRC.
- A query such as ?all_req_sat(N) can be expressed either
  - by using double negation (and only existential quantifiers), or
  - directly, by using the universal quantifier.
- Example: find the seniors who completed all cs requirements:
  { (N) | ∃M (student(N, M, senior)) ∧ ∀C (req(cs, C) → ∃G (took(N, C, G))) }
- The implication sign: p → q is a shorthand for ¬p ∨ q.
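The two formulations of all_req_sat can be compared in a short Python sketch. The req facts and the student 'Sue Kim' are hypothetical, since the slides do not list the req relation; tuples in sets stand in for facts.

```python
student = {
    ("Joe Doe", "cs", "senior"),
    ("Sue Kim", "cs", "senior"),      # hypothetical student
    ("Jim Jones", "cs", "junior"),
}
took = {
    ("Joe Doe", "cs123", 2.7),
    ("Sue Kim", "cs101", 3.4),        # hypothetical fact
    ("Sue Kim", "cs143", 3.8),        # hypothetical fact
    ("Jim Jones", "cs101", 3.0),
    ("Jim Jones", "cs143", 3.3),
}
req = {("cs", "cs101"), ("cs", "cs143")}   # hypothetical requirement list

hastaken = {(name, course) for (name, course, _grade) in took}
seniors = {name for (name, _major, year) in student if year == "senior"}

# Double negation, as in the Datalog rules:
# req_missing(Name) <- student(Name, _, senior), req(cs, Course), not hastaken(Name, Course).
req_missing = {
    name
    for name in seniors
    for (major, course) in req
    if major == "cs" and (name, course) not in hastaken
}
all_req_sat_negation = seniors - req_missing

# Direct universal quantifier, with p -> q written as (not p) or q.
# Iterating over the tuples of req is enough here: for any course that is not
# a cs requirement the implication holds vacuously.
all_req_sat_forall = {
    name
    for name in seniors
    if all((major != "cs") or ((name, course) in hastaken) for (major, course) in req)
}
print(all_req_sat_negation, all_req_sat_forall)
```

Both formulations return the same set under these assumed facts, which is the point of the double-negation rewriting.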

Tuple Relational Calculus (TRC)
- In TRC, variables range over the tuples of a relation. For instance, the TRC expression for the query ?firstreq(N) is:
  { (t[1]) | ∃u ∃s (took(t) ∧ took(u) ∧ student(s) ∧ t[2] = cs101 ∧ u[2] = cs143 ∧ t[1] = u[1] ∧ s[3] = junior ∧ s[1] = t[1]) }
- The variables t and u denote tuples ranging over took, and s denotes a tuple ranging over student. t[1] denotes the first component of t (corresponding to Name).
- TRC requires an explicit statement of equality (e.g., s[1] = t[1]), while in DRC equality is denoted implicitly by the presence of the same variable in different places.
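The TRC expression translates almost literally into a Python set comprehension over whole tuples. This is only a sketch: tuples in sets stand in for relations, and Python indexes from 0, so t[1] on the slide is t[0] in the code.

```python
student = {
    ("Joe Doe", "cs", "senior"),
    ("Jim Jones", "cs", "junior"),
    ("Jim Black", "ee", "junior"),
}
took = {
    ("Joe Doe", "cs123", 2.7),
    ("Jim Jones", "cs101", 3.0),
    ("Jim Jones", "cs143", 3.3),
    ("Jim Black", "cs143", 3.3),
    ("Jim Black", "cs101", 2.7),
}

# t, u range over took and s ranges over student; every equality is explicit,
# exactly as TRC requires.
firstreq_trc = {
    t[0]
    for t in took
    for u in took
    for s in student
    if t[1] == "cs101"
    and u[1] == "cs143"
    and t[0] == u[0]
    and s[2] == "junior"
    and s[0] == t[0]
}
print(sorted(firstreq_trc))
```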

Relational DB Languages
- The various languages are quite different, but they have the same expressive power.
- Safe TRC and DRC expressions are equivalent, and there are mappings that transform any formula in one language into an equivalent one in the other.
- For each TRC or DRC formula there is an equivalent nonrecursive Datalog program. The converse is also true, since a nonrecursive Datalog program can be mapped into an equivalent DRC query.
- Another language equivalent to these is relational algebra (RA).
- RA is an operator-based language, and thus provides a useful link to the concrete implementation of these logic-based languages.
- Languages that can express every query expressible in these languages are called relationally complete.
- Relational completeness is necessary, but it is much less than Turing completeness and is no longer sufficient in the commercial world (ergo the mission of CS240A).

Commercial DB Languages
- The actual query languages of commercial RDBMSs are largely based on the formal query languages just discussed. For instance:
  - Query-By-Example (QBE) is a visual query language based on DRC.
  - Languages such as QUEL and SQL are instead based on TRC.
- In QUEL and SQL, the notations t.Name and t.Course are used instead of t[1] and t[2]; also, existential quantification is replaced by the constructs RANGE (in QUEL) and FROM (in SQL).
- Relational algebra provides a good basis for the efficient implementation of these relational languages.
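As an illustration of the TRC flavor of SQL, the sketch below runs the firstreq query through Python's standard sqlite3 module. The table layout and column types are assumptions chosen to match the student and took relations from the slides; the tuple variables t, u and s are declared in the FROM clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (Name TEXT, Major TEXT, Year TEXT)")
conn.execute("CREATE TABLE took (Name TEXT, Course TEXT, Grade REAL)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("Joe Doe", "cs", "senior"), ("Jim Jones", "cs", "junior"), ("Jim Black", "ee", "junior")],
)
conn.executemany(
    "INSERT INTO took VALUES (?, ?, ?)",
    [
        ("Joe Doe", "cs123", 2.7),
        ("Jim Jones", "cs101", 3.0),
        ("Jim Jones", "cs143", 3.3),
        ("Jim Black", "cs143", 3.3),
        ("Jim Black", "cs101", 2.7),
    ],
)

# FROM introduces the tuple variables; t.Name and t.Course replace t[1] and t[2].
rows = conn.execute(
    """
    SELECT DISTINCT t.Name
    FROM took AS t, took AS u, student AS s
    WHERE t.Course = 'cs101' AND u.Course = 'cs143' AND t.Name = u.Name
      AND s.Year = 'junior' AND s.Name = t.Name
    ORDER BY t.Name
    """
).fetchall()
names = [row[0] for row in rows]
conn.close()
print(names)
```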

Relational Algebra (RA)
A family of operators on relations that have the closure property: they take relations as arguments and return relations as results.
- Union. The union of relations R and S, denoted R ∪ S, is the set of tuples that are in R, or in S, or in both:
  R ∪ S = { t | t ∈ R ∨ t ∈ S }
  This operation is defined only if R and S have the same number of columns.
- Set difference. The tuples that belong to R but not to S:
  R − S = { t | t ∈ R ∧ ¬∃r (r ∈ S ∧ t = r) }
  This operation is defined only if R and S have the same number of columns. Say that that number is n. Then t = r denotes that t[1] = r[1] ∧ ... ∧ t[n] = r[n].

Relational Operators
- Cartesian product.
  R × S = { t | (∃r ∈ R) (∃s ∈ S) (t[1, ..., n] = r ∧ t[n+1, ..., n+m] = s) }
  If R has n columns and S has m columns, then R × S contains all the possible (n+m)-tuples whose first n components form a tuple in R and whose last m components form a tuple in S. Thus, R × S has n+m columns and |R| × |S| tuples, where |R| and |S| denote the respective cardinalities of the two relations.
- Projection. Let L′ be a sublist of the columns of R (with possible reordering):
  π_L′ R = { r[L′] | r ∈ R }
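The operators defined so far can be sketched in a few lines of Python, assuming relations are modelled as sets of equal-length tuples. The function names and the sample relations R and S are illustrative assumptions; column lists are 1-based, as in the $1, $2 notation.

```python
def same_arity(r, s):
    """True when all tuples of R and S have the same number of columns."""
    return len({len(t) for t in r | s}) <= 1


def union(r, s):
    assert same_arity(r, s), "union needs the same number of columns"
    return r | s


def difference(r, s):
    assert same_arity(r, s), "difference needs the same number of columns"
    return {t for t in r if t not in s}


def cartesian_product(r, s):
    # Each result tuple is a tuple of R followed by a tuple of S.
    return {t + u for t in r for u in s}


def project(r, columns):
    # columns is a list of 1-based column numbers, possibly reordered.
    return {tuple(t[i - 1] for i in columns) for t in r}


R = {(1, "a"), (2, "b")}
S = {(2, "b"), (3, "c")}
print(union(R, S))
print(difference(R, S))
print(cartesian_product(R, S))
print(project(R, [2, 1]))
```

Because relations are sets, projection removes duplicates automatically, and the product of two 2-tuple relations with two tuples each has four 4-tuples.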

Relational Operators (cont.)
- Selection. σ_F R denotes the selection on R according to the selection formula F, where F obeys one of the following patterns:
  (i) $i θ C, where $i is a column of R, θ is an arithmetic comparison operator, and C is a constant, or
  (ii) $i θ $j, where $i and $j are columns of R, and θ is an arithmetic comparison operator, or
  (iii) an expression built from terms such as those described in (i) and (ii) above, and the logical connectives ∨, ∧, and ¬.
  Then:
  σ_F R = { t | t ∈ R ∧ F′ }
  where F′ denotes the formula obtained from F by replacing $i and $j with t[i] and t[j].
  For example, if F is "$2 = $3 ∧ $1 = bob", then F′ is "t[2] = t[3] ∧ t[1] = bob". Thus:
  σ_{$2 = $3 ∧ $1 = bob} R = { t | t ∈ R ∧ t[2] = t[3] ∧ t[1] = bob }
- All previous operators, except set difference, are monotonic.
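A minimal Python sketch of selection, together with a spot check of the monotonicity remark. The sample relations are hypothetical, and the formula F′ is passed as an ordinary Python predicate over a tuple t (0-based indices, so $2 is t[1]).

```python
def select(r, formula):
    """sigma_F R: keep the tuples of R that satisfy the predicate F'."""
    return {t for t in r if formula(t)}


R = {("bob", 1, 1), ("bob", 1, 2), ("ann", 3, 3)}


def f_prime(t):
    # F is "$2 = $3 AND $1 = bob"; F' replaces $i by the i-th component of t.
    return t[1] == t[2] and t[0] == "bob"


selected = select(R, f_prime)
print(selected)

# Monotonic: enlarging the input can only enlarge the result.
R_bigger = R | {("bob", 5, 5)}
selection_grows = select(R, f_prime) <= select(R_bigger, f_prime)

# Set difference is not monotonic in its second argument:
# enlarging S can remove tuples from R - S.
S = {("ann", 3, 3)}
S_bigger = S | {("bob", 1, 1)}
difference_shrinks = not ((R - S) <= (R - S_bigger))
print(selection_grows, difference_shrinks)
```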

Additional Operators
- Additional operators of frequent use can be derived from these. For instance, we have join, semijoin, intersection, division and generalized projection.
- The join operator, R ⋈_F S, can be constructed using Cartesian product and selection:
  R ⋈_F S = σ_F′ (R × S)
  where F = $i1 θ1 $j1 ∧ ... ∧ $ik θk $jk; i1, ..., ik are columns of R; j1, ..., jk are columns of S; and θ1, ..., θk are comparison operators. Then, if R has arity m, we define F′ = $i1 θ1 $(j1+m) ∧ ... ∧ $ik θk $(jk+m).
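The construction of the join from product and selection can be sketched as follows. The function name theta_join and the encoding of F as a list of (i, op, j) triples are assumptions; the shift of each S column j to position j + m in the product is the step described above.

```python
import operator


def theta_join(r, s, m, conditions):
    """Join R and S, where m is the arity of R and each condition (i, op, j)
    compares column $i of R with column $j of S (both 1-based)."""
    product = {t + u for t in r for u in s}
    # F' refers to column j of S as column j + m of the product.
    return {
        t
        for t in product
        if all(op(t[i - 1], t[j + m - 1]) for (i, op, j) in conditions)
    }


took = {
    ("Joe Doe", "cs123", 2.7),
    ("Jim Jones", "cs101", 3.0),
    ("Jim Jones", "cs143", 3.3),
    ("Jim Black", "cs143", 3.3),
    ("Jim Black", "cs101", 2.7),
}
student = {
    ("Joe Doe", "cs", "senior"),
    ("Jim Jones", "cs", "junior"),
    ("Jim Black", "ee", "junior"),
}

# Equijoin of took and student on Name: F is "$1 = $1", so F' is "$1 = $4".
joined = theta_join(took, student, 3, [(1, operator.eq, 1)])
for row in sorted(joined):
    print(row)
```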

Additional Operators (cont.)
- The intersection of two relations can be constructed either by taking the equijoin of the two relations in every column (and then projecting out duplicate columns) or by using the following property:
  R ∩ S = R − (R − S) = S − (S − R)
- The generalized projection of a relation R is denoted π_L (R), where L is a list of column numbers and constants. Unlike ordinary projection, components might appear more than once, and constants are permitted as components of the list L; e.g., π_{$1, c, $1} is a valid generalized projection.
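Both constructions can be checked on a small example. In this sketch the sample relations are hypothetical, and the list L is encoded as ("col", i) and ("const", c) pairs, which is an assumption made for illustration.

```python
def generalized_project(r, items):
    """items is a list of ("col", i) for column $i (1-based) or ("const", c)."""
    return {
        tuple(t[value - 1] if kind == "col" else value for (kind, value) in items)
        for t in r
    }


R = {(1, "a"), (2, "b")}
S = {(2, "b"), (3, "c")}

# Intersection expressed with set difference only.
via_r = R - (R - S)
via_s = S - (S - R)
print(via_r, via_s)

# The generalized projection with list $1, c, $1: a column repeated and a constant.
gp = generalized_project(R, [("col", 1), ("const", "c"), ("col", 1)])
print(gp)
```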

Unsafe Rules
An unsafe rule: to find grades better than the grade Joe Doe got in cs143, a user might write
  bettergrade(G1) ← took('Joe Doe', cs143, G), G1 > G.
- Infinite answers. Assuming that, say, Joe Doe got a grade of 3.3 (i.e., B+) in course cs143, there are infinitely many numbers that satisfy the condition of being greater than 3.3.
- Lack of domain independence. A query formula is said to be domain independent when its answer depends only on the database and the constants in the query, but not on the domain of interpretation. The set of values for G1 satisfying the rule above depends on what domain we assume for numbers: e.g., integer, rational or real.
- No relational algebra equivalent. RA expressions take DB tables and constants as operands and return finite relations.
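The lack of domain independence can be illustrated by evaluating the rule over two small, finite stand-in domains. Both domains are hypothetical and chosen only to keep the example finite; the took fact uses the grade of 3.3 assumed above.

```python
took = {("Joe Doe", "cs143", 3.3)}   # the grade assumed for this example


def bettergrade(domain):
    """bettergrade(G1) <- took('Joe Doe', cs143, G), G1 > G, with G1 drawn from `domain`."""
    return {
        g1
        for g1 in domain
        for (name, course, g) in took
        if name == "Joe Doe" and course == "cs143" and g1 > g
    }


integers_0_to_5 = set(range(6))              # 0, 1, ..., 5
half_steps = {x / 2 for x in range(11)}      # 0.0, 0.5, ..., 5.0

# Same database, same rule, different answers: G1 is bound only by the
# comparison, so the result depends on the domain we pick for it.
print(sorted(bettergrade(integers_0_to_5)))
print(sorted(bettergrade(half_steps)))
```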

Safety
- In practical languages, it is desirable to allow only safe formulas, which avoid the problems of infinite answers and loss of domain independence.
- But the problems of domain independence and finiteness of answers are undecidable even for nonrecursive queries. Therefore, necessary and sufficient syntactic conditions that characterize safe formulas cannot be given in general.
- In practice, therefore, sufficient conditions are defined that might be more restrictive than necessary.

Safe Datalog: an inductive definition
1. Safe Predicates. A predicate q of P is safe if (i) q is a database predicate, or (ii) every rule defining q is safe.
2. Safe Variables. A variable X in rule r is safe if (i) X is contained in some positive goal q(t1, ..., tn), where the predicate q(A1, ..., An) is safe, or (ii) r contains some equality goal X = Y, where Y is safe.
3. Safe Rules. A rule r is safe if all its variables are safe.
4. The goal ?q(t1, ..., tn) is safe when the predicate q(A1, ..., An) is safe.
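The variable and rule conditions of this definition can be sketched as a small checker for a single rule. This is a simplification under stated assumptions: the set of safe predicates is supplied by the caller rather than computed inductively, only variables (not constants) are listed in each goal, and an equality X = Y is treated symmetrically.

```python
def safe_variables(positive_goals, equalities, safe_predicates):
    """positive_goals: list of (predicate, [variables]); equalities: list of (X, Y)."""
    safe = set()
    # (i) variables occurring in a positive goal over a safe predicate
    for predicate, variables in positive_goals:
        if predicate in safe_predicates:
            safe.update(variables)
    # (ii) variables equated to a safe variable; repeat until nothing changes
    changed = True
    while changed:
        changed = False
        for x, y in equalities:
            for a, b in ((x, y), (y, x)):
                if b in safe and a not in safe:
                    safe.add(a)
                    changed = True
    return safe


def rule_is_safe(rule_variables, positive_goals, equalities, safe_predicates):
    """A rule is safe if all its variables are safe."""
    return set(rule_variables) <= safe_variables(positive_goals, equalities, safe_predicates)


database_predicates = {"took", "q", "p"}

# bettergrade(G1) <- took('Joe Doe', cs143, G), G1 > G.   G1 occurs only in a comparison.
unsafe_example = rule_is_safe({"G1", "G"}, [("took", ["G"])], [], database_predicates)

# s(Z, b, W) <- q(X, X, Y), p(Y, Z, a), W = Z, W > 24.3.   W is made safe by W = Z.
safe_example = rule_is_safe(
    {"X", "Y", "Z", "W"},
    [("q", ["X", "X", "Y"]), ("p", ["Y", "Z"])],
    [("W", "Z")],
    database_predicates,
)
print(unsafe_example, safe_example)
```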

From Safe Rules to RA [Step 1]
P is transformed into an equivalent program P′ that does not contain any equality goal, by replacing equals with equals and removing the equality goals. For example:
  s(Z, b, W) ← q(X, X, Y), p(Y, Z, a), W = Z, W > 24.3.
is translated into:
  s(Z, b, Z) ← q(X, X, Y), p(Y, Z, a), Z > 24.3.

Mapping [Step 2]
The body of r is translated into the RA expression Body_r. Body_r is the Cartesian product of all (base or derived) relations in the body, followed by a selection σ_F, where F is the conjunction of the following conditions:
  (i) an inequality for each such goal (e.g., Z > 24.3),
  (ii) equality between columns containing the same variable,
  (iii) equality between a column and the constant therein.
E.g., for the rule
  s(Z, b, Z) ← q(X, X, Y), p(Y, Z, a), Z > 24.3.
  (i) Z > 24.3 translates into the selection condition $5 > 24.3,
  (ii) the two occurrences of X translate into $1 = $2, while the two Ys map into $3 = $4, and
  (iii) the constant in the last column of P maps into $6 = a.
Thus:
  Body_r = σ_{$1 = $2, $3 = $4, $6 = a, $5 > 24.3} (Q × P)

Mapping Datalog to RA [Steps 3 & 4]
Step 3: Each rule r is translated into an extended projection on Body_r, according to the patterns in the head of r. For the rule at hand,
  s(Z, b, Z) ← q(X, X, Y), p(Y, Z, a), Z > 24.3.
we obtain:
  S = π_{$5, b, $5} (Body_r)
Step 4: Multiple rules with the same head are translated into the union of their equivalent expressions.
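The translation can be traced on sample data. The contents of Q and P below are hypothetical; the sketch builds Q × P, applies the selection conditions ($1 = $2, $3 = $4, $6 = a, $5 > 24.3), applies the extended projection ($5, b, $5), and compares the result with a direct reading of the rule.

```python
# Hypothetical contents for the relations q and p.
Q = {(1, 1, "y1"), (1, 2, "y1"), (3, 3, "y2")}
P = {("y1", 30.0, "a"), ("y1", 10.0, "a"), ("y2", 50.0, "b"), ("y2", 25.0, "a")}

# Step 2: Cartesian product followed by the selection.
# Columns $1..$3 come from Q and $4..$6 from P (0-based indices 0..5 in Python).
product = {q + p for q in Q for p in P}
body_r = {
    t
    for t in product
    if t[0] == t[1]        # $1 = $2 : the two occurrences of X
    and t[2] == t[3]       # $3 = $4 : the two occurrences of Y
    and t[5] == "a"        # $6 = a  : the constant in the last column of P
    and t[4] > 24.3        # $5 > 24.3
}

# Step 3: extended projection following the head s(Z, b, Z).
S = {(t[4], "b", t[4]) for t in body_r}

# Direct reading of the rule, for comparison.
direct = {
    (z, "b", z)
    for (x1, x2, y) in Q
    for (y2, z, c) in P
    if x1 == x2 and y == y2 and c == "a" and z > 24.3
}
print(sorted(S))
```

For Step 4, a second rule with head s would be translated in the same way and its result combined with S by set union.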

Equivalence of RA and Safe Nonrecursive Datalog Programs
- Negated goals: a little more complex; see homework!
- Theorem: Let P be a safe Datalog program without recursion or function symbols. Then, for each predicate in P, there exists an equivalent relational algebra expression.

Relational Tables for a BoM application
PART_COST
  BASIC_PART   SUPPLIER     COST   TIME
  top_tube     cinelli
  top_tube     columbus
  down_tube    columbus
  head_tube    cinelli
  head_tube    columbus
  seat_mast    cinelli
  seat_mast    cinelli
  seat_stay    cinelli
  seat_stay    columbus
  chain_stay   columbus
  fork         cinelli
  fork         columbus
  spoke        campagnolo
  nipple       mavic        0.10   3
  hub          campagnolo
  hub          suntour
  rim          mavic
  rim          araya        7.00   1
ASSEMBLY
  PART    SUBPART      QTY
  bike    frame        1
  bike    wheel        2
  frame   top_tube     1
  frame   down_tube    1
  frame   head_tube    1
  frame   seat_mast    1
  frame   seat_stay    2
  frame   chain_stay   2
  frame   fork         1
  wheel   spoke        36
  wheel   nipple       1
  wheel   rim          1
  wheel   hub          1
  wheel   tire         1