
CPSC 504: Data Management Review of Relational Model 2/2 Laks V.S. Lakshmanan Dept. of CS UBC.


1 CPSC 504: Data Management Review of Relational Model 2/2 Laks V.S. Lakshmanan Dept. of CS UBC

2 Getting at the data – Querying Relational DBs are queried with SQL. But where did that come from, and what is the basis for it? Relational DBs can be queried using logic. In fact, we will review some logic-based QLs. SQL = logic + some practically crucial features like aggregation & nesting.

3 Logic Query Language(s) stocks(Ticker, Company), prices(Date, Ticker, Type, Value), indexes(Date, DOW, TSX, S&P). Find the ticker of “Syncrude Corp.”: –{T.Ticker | stocks(T) & T.Company = “Syncrude Corp.”}. Find the tickers and names of companies, together with the dates and the corresponding closing prices, on those days when the DOW was at least 12,000. –{(T.Ticker, T.Company, P.Date, P.Value) | stocks(T) & prices(P) & indexes(I) & T.Ticker=P.Ticker & P.Date=I.Date & I.DOW>=12000 & P.Type='closing'}.
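
A TRC query reads much like a set comprehension: tuple variables range over relations, the conditions filter, and the output tuple is assembled from pieces of the variables. The following Python sketch mirrors the two queries on this slide. The sample rows are hypothetical (the slides give only the schemas), and `SP` stands in for the `S&P` column.

```python
from collections import namedtuple

# Schemas from the slide; all rows below are made-up sample data.
Stock = namedtuple("Stock", "Ticker Company")
Price = namedtuple("Price", "Date Ticker Type Value")
Index = namedtuple("Index", "Date DOW TSX SP")

stocks = {Stock("SYN", "Syncrude Corp."), Stock("ABC", "ABC Inc.")}
prices = {
    Price("2011/08/15", "SYN", "closing", 40.0),
    Price("2011/08/15", "ABC", "closing", 25.0),
    Price("2011/08/15", "ABC", "opening", 24.0),
    Price("2011/08/16", "SYN", "closing", 41.0),
}
indexes = {
    Index("2011/08/15", 12100, 12500, 1200),
    Index("2011/08/16", 11900, 12400, 1190),
}

# {T.Ticker | stocks(T) & T.Company = "Syncrude Corp."}
q1 = {T.Ticker for T in stocks if T.Company == "Syncrude Corp."}

# {(T.Ticker, T.Company, P.Date, P.Value) | stocks(T) & prices(P) & indexes(I) & ...}
q2 = {
    (T.Ticker, T.Company, P.Date, P.Value)
    for T in stocks
    for P in prices
    for I in indexes
    if T.Ticker == P.Ticker
    and P.Date == I.Date
    and I.DOW >= 12000
    and P.Type == "closing"
}
```

With this sample data, only 2011/08/15 passes the DOW condition, so `q2` holds the two closing prices for that day.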

4 Logic QL(s) – Tuple Relational Calculus TRC key features: –Tuple variables (basic unit) –Output tuple assembled from pieces of tuple vars –Conditions imposed as “built-in” predicates –Quantifiers Quantifier example: Find stocks (tickers) which had a higher closing price than every other company on August 15, 2011. {(T.Ticker) | stocks(T) & (∃P1)[prices(P1) & T.Ticker=P1.Ticker & P1.Type='closing' & P1.Date=2011/08/15 & (∀P2)[prices(P2) & P2.Date=2011/08/15 & P2.Type='closing' → P2.Value ≤ P1.Value]]}.
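
The quantified formula can be read operationally: ∃ becomes `any`, ∀ becomes `all`, and the implication A → B becomes `(not A) or B`. The Python sketch below evaluates the quantifier example that way over hypothetical sample rows; the helper `is_closing_on` is introduced here only for readability.

```python
# Hypothetical sample data: (Date, Ticker, Type, Value) and (Ticker, Company).
prices = [
    ("2011/08/15", "AAA", "closing", 30.0),
    ("2011/08/15", "BBB", "closing", 45.0),
    ("2011/08/15", "CCC", "closing", 45.0),
    ("2011/08/15", "AAA", "opening", 99.0),
    ("2011/08/14", "AAA", "closing", 80.0),
]
stocks = [("AAA", "A Corp."), ("BBB", "B Corp."),
          ("CCC", "C Corp."), ("DDD", "D Corp.")]
DAY = "2011/08/15"


def is_closing_on(p, day):
    """True if price tuple p is a closing price on the given day."""
    return p[0] == day and p[2] == "closing"


top = {
    t
    for (t, _company) in stocks
    if any(                                   # (∃P1)[...]
        p1[1] == t
        and is_closing_on(p1, DAY)
        and all(                              # (∀P2)[A → B]  ==  not A or B
            (not is_closing_on(p2, DAY)) or p2[3] <= p1[3]
            for p2 in prices
        )
        for p1 in prices
    )
}
```

Because the formula uses ≤, stocks tied for the highest closing price are all returned (`BBB` and `CCC` here), and a stock with no closing price that day (`DDD`) is excluded by the existential part.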

5 Logic QL – Datalog (in lieu of Domain Relational Calculus) Rule-based query language. Syntax similar to DRC. Supports recursion. E.g.: Q1: q1(T) ← stocks(T, 'Syncrude Corp.'). Q2: q2(T, C, D, P) ← stocks(T, C) & prices(D, T, 'closing', P) & indexes(D, DJ, W1, W2) & DJ >= 12000.

6 Datalog (contd.) Note the use of variables and constants as predicate arguments. Database predicates vs. built-in predicates. Base tables vs. derived tables (aka views). Rule ::= Head ← Body. Head – a DB predicate. Body – a conjunction of DB and built-in predicates. Query – a set of rules, defining a query predicate. Rules need to be safe.

7 Datalog (contd.) There is an implicit ∃ in front of every rule body. – e.g.? Can we express ∀ at all? E.g.: Q3: q3(T) ← stocks(T, C) & ¬bad(T). bad(T1) ← stocks(T1, C1) & stocks(T2, C2) & prices(2007/08/15, T1, 'closing', V1) & prices(2007/08/15, T2, 'closing', V2) & V2 > V1.
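
One way to read Q3 is as a two-step evaluation: first compute the derived predicate `bad`, then keep the stocks that are not in it, which is how ∀ gets expressed through negation. The Python sketch below follows that order; the rows are hypothetical sample data.

```python
# Hypothetical sample data for stocks(Ticker, Company)
# and prices(Date, Ticker, Type, Value).
stocks = {("AAA", "A Corp."), ("BBB", "B Corp."), ("CCC", "C Corp.")}
prices = {
    ("2007/08/15", "AAA", "closing", 30.0),
    ("2007/08/15", "BBB", "closing", 45.0),
    ("2007/08/14", "AAA", "closing", 90.0),
}
DAY = "2007/08/15"

tickers = {t for (t, _c) in stocks}

# Closing prices on the day of interest, restricted to known stocks.
closing = {
    (t, v)
    for (d, t, ty, v) in prices
    if d == DAY and ty == "closing" and t in tickers
}

# bad(T1) ← ... V2 > V1: T1 is beaten by some other closing price that day.
bad = {t1 for (t1, v1) in closing for (_t2, v2) in closing if v2 > v1}

# q3(T) ← stocks(T, C) & ¬bad(T): negation as set difference.
q3 = tickers - bad
```

Note that `CCC`, which has no closing price on that day, is never `bad` and therefore appears in `q3`; this follows from the rules as written on the slide.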

8 Datalog (contd.) Datalog can go beyond what we have just seen. Recursion: e.g., let flights(F, T) denote there is a direct flight from city F to city T. Find all cities you can fly to from Vancouver, possibly in a series of hops. flyTo(X, Y) ← flights(X, Y). flyTo(X, Y) ← flights(X, Z) & flyTo(Z, Y). ?– flyTo('Vancouver', Y).
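
A recursive Datalog program can be evaluated by applying the rules repeatedly until no new facts appear (naive bottom-up evaluation). The Python sketch below does this for the two `flyTo` rules; the flight data is hypothetical and includes a cycle to show that the iteration still terminates.

```python
def fly_to(flights):
    """Naive fixpoint evaluation of the two flyTo rules."""
    result = set(flights)          # flyTo(X, Y) ← flights(X, Y).
    while True:
        # flyTo(X, Y) ← flights(X, Z) & flyTo(Z, Y).
        new = {
            (x, y)
            for (x, z) in flights
            for (z2, y) in result
            if z == z2
        }
        if new <= result:          # nothing new derived: fixpoint reached
            return result
        result |= new


# Hypothetical direct flights (From, To).
flights = {
    ("Vancouver", "Calgary"),
    ("Calgary", "Toronto"),
    ("Toronto", "Montreal"),
    ("Montreal", "Toronto"),
    ("Seattle", "Vancouver"),
}

# ?– flyTo('Vancouver', Y).
reachable = {y for (x, y) in fly_to(flights) if x == "Vancouver"}
```

Real engines use more refined strategies than this naive loop, but the result is the same set of facts.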

9 Datalog wrap up. Efficient query answering – esp. when recursion, negation, aggregation  (will see shortly), or combos are present. Powerful QL. Numerous efficient QP strategies have been developed.

10 Relational Algebra RA is based on five simple ops – select, project, Cartesian (aka cross) product, union, minus. When combined, it makes for a rather powerful QL, equiv. in expressive power, to TRC or Datalog w/o recursion. You just need efficient algorithms for basic ops and useful macros. And a query optimizer that chooses the best plan for evaluating a query based on estimated cost, using a cost model.

11 RA Select: σ Company='Syncrude Corp.' (stocks) – keep only the tuples whose value for Company is 'Syncrude Corp.' Project: π Ticker (stocks) – find all tickers. Product: stocks × prices – find all combinations of tuples from the two relations. Union: π Ticker (stocks) ∪ π Ticker (prices). Minus: π Ticker (stocks) − π Ticker (prices).
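
The five basic operators are small enough to sketch directly. In the Python sketch below a relation is modelled, as an assumption of this example, by a pair (attribute names, set of tuples); the function names and sample rows are hypothetical.

```python
def select(rel, pred):
    """σ: keep tuples satisfying pred (pred sees a dict of attribute values)."""
    attrs, rows = rel
    return attrs, {r for r in rows if pred(dict(zip(attrs, r)))}


def project(rel, keep):
    """π: keep only the listed attributes; duplicates vanish (set semantics)."""
    attrs, rows = rel
    idx = [attrs.index(a) for a in keep]
    return tuple(keep), {tuple(r[i] for i in idx) for r in rows}


def product(r1, r2):
    """×: every combination of a tuple from r1 with a tuple from r2."""
    return r1[0] + r2[0], {x + y for x in r1[1] for y in r2[1]}


def union(r1, r2):
    if r1[0] != r2[0]:
        raise ValueError("union needs identical schemas")
    return r1[0], r1[1] | r2[1]


def minus(r1, r2):
    if r1[0] != r2[0]:
        raise ValueError("minus needs identical schemas")
    return r1[0], r1[1] - r2[1]


# Hypothetical sample data.
stocks = (("Ticker", "Company"),
          {("SYN", "Syncrude Corp."), ("ABC", "ABC Inc."), ("XYZ", "XYZ Ltd.")})
prices = (("Date", "Ticker", "Type", "Value"),
          {("2011/08/15", "SYN", "closing", 40.0),
           ("2011/08/15", "ABC", "closing", 25.0)})

syncrude = select(stocks, lambda t: t["Company"] == "Syncrude Corp.")
all_tickers = project(stocks, ["Ticker"])
priced = project(prices, ["Ticker"])
either = union(all_tickers, priced)
unpriced = minus(all_tickers, priced)   # tickers with no price at all
combos = product(stocks, prices)
```

This sketch does not rename clashing attributes in a product (both inputs here have `Ticker`), so selecting on the product by name would need a renaming step first.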

12 RA Example “macros”: Join and division – examples. Other macros: In implementing operators, you want to piggyback when it makes sense: e.g., if we want to compute a Join;select;project cascade, we can do select and project “for free” on the fly, while paying only for joining. Exercise: Express Q1—Q3 in RA.
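
Join and division are "macros" in the sense that both can be written with the basic operators alone. The Python sketch below spells out the standard definitions on hypothetical data: an equijoin as product followed by select and project, and division as π_A(R) − π_A((π_A(R) × S) − R).

```python
# Hypothetical sample data.
stocks = {("SYN", "Syncrude Corp."), ("ABC", "ABC Inc.")}
prices = {
    ("2011/08/15", "SYN", "closing", 40.0),
    ("2011/08/16", "SYN", "closing", 41.0),
    ("2011/08/15", "ABC", "closing", 25.0),
}

# Join = product, then select on equal tickers, then project away the copy.
product = {s + p for s in stocks for p in prices}
joined = {
    (t, c, d, ty, v)
    for (t, c, d, t2, ty, v) in product
    if t == t2
}

# Division: which tickers have a closing price on EVERY recorded date?
traded = {(t, d) for (d, t, ty, _v) in prices if ty == "closing"}  # R(Ticker, Date)
days = {d for (_t, d) in traded}                                   # S(Date)
candidates = {t for (t, _d) in traded}                             # π_Ticker(R)
missing = {(t, d) for t in candidates for d in days} - traded      # (π_Ticker(R) × S) − R
every_day = candidates - {t for (t, _d) in missing}                # R ÷ S
```

The join above filters while iterating over the product, which is a small instance of the "piggybacking" idea: selection and projection are done on the fly rather than as separate passes.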

13 SQL (Structured Query Language) Inspired mostly by TRC. Ad hoc additions – partly inspired by RA and partly by need. –“Natural join”, “left outer join”, etc. –SUM(Sal), AVG(Height), etc. –Nesting queries inside others. SQL can also express updates, unlike the “pure” QLs seen so far.

14 SQL review (contd.) Q1: select Ticker from stocks where Company='Syncrude Corp.' What is the connection to TRC? Q2: select S.Ticker, Company, P.Date, Value from stocks S, prices P, indexes I where S.Ticker=P.Ticker AND P.Date=I.Date AND I.DOW>=12000 AND P.Type='closing'

15 SQL review (contd.) Q3: select S.Ticker from stocks S where NOT EXISTS ( select * from stocks S2, prices P1, prices P2 where P1.Date='2007/08/15' AND P2.Date='2007/08/15' AND P1.Type='closing' AND P2.Type='closing' AND S.Ticker=P1.Ticker AND S2.Ticker=P2.Ticker AND P1.Value < P2.Value )
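
Q1 to Q3 can be tried out on a small in-memory database. The sketch below uses Python's built-in `sqlite3` module; the rows are hypothetical, dates are stored as text, the `S&P` column is renamed `SP`, and Q2 and Q3 include the 'closing' condition from the TRC and Datalog versions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE stocks(Ticker TEXT PRIMARY KEY, Company TEXT);
CREATE TABLE prices(Date TEXT, Ticker TEXT, Type TEXT, Value REAL);
CREATE TABLE indexes(Date TEXT, DOW REAL, TSX REAL, SP REAL);
-- Hypothetical sample rows.
INSERT INTO stocks VALUES ('SYN', 'Syncrude Corp.'), ('ABC', 'ABC Inc.');
INSERT INTO prices VALUES
  ('2007/08/15', 'SYN', 'closing', 40.0),
  ('2007/08/15', 'ABC', 'closing', 25.0),
  ('2007/08/15', 'ABC', 'opening', 50.0);
INSERT INTO indexes VALUES ('2007/08/15', 12100, 12500, 1400);
""")

q1 = con.execute(
    "SELECT Ticker FROM stocks WHERE Company = 'Syncrude Corp.'"
).fetchall()

q2 = con.execute("""
    SELECT S.Ticker, Company, P.Date, Value
    FROM stocks S, prices P, indexes I
    WHERE S.Ticker = P.Ticker AND P.Date = I.Date
      AND I.DOW >= 12000 AND P.Type = 'closing'
    ORDER BY S.Ticker
""").fetchall()

q3 = con.execute("""
    SELECT S.Ticker FROM stocks S
    WHERE NOT EXISTS (
        SELECT * FROM stocks S2, prices P1, prices P2
        WHERE P1.Date = '2007/08/15' AND P2.Date = '2007/08/15'
          AND P1.Type = 'closing' AND P2.Type = 'closing'
          AND S.Ticker = P1.Ticker AND S2.Ticker = P2.Ticker
          AND P1.Value < P2.Value)
""").fetchall()
```

The opening price of 50.0 for `ABC` is there on purpose: without the `Type = 'closing'` conditions it would wrongly knock `SYN` out of the Q3 answer.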

16 SQL review wrap up Q3 can be expressed more concisely using grouping and aggregation. Q4: Find the average value of each type of price. select Type, AVG(Value) from prices group by Type
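
Q4 runs as written, and one possible aggregate-based formulation of Q3 compares each closing price with the day's maximum. The Python `sqlite3` sketch below shows both on hypothetical rows; the MAX-based query is an illustration, not necessarily the formulation the slide has in mind, and unlike the NOT EXISTS version it never returns a stock that has no closing price on that day.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE prices(Date TEXT, Ticker TEXT, Type TEXT, Value REAL);
-- Hypothetical sample rows.
INSERT INTO prices VALUES
  ('2007/08/15', 'SYN', 'closing', 40.0),
  ('2007/08/15', 'ABC', 'closing', 20.0),
  ('2007/08/15', 'ABC', 'opening', 50.0);
""")

# Q4: average value of each type of price.
q4 = con.execute(
    "SELECT Type, AVG(Value) FROM prices GROUP BY Type ORDER BY Type"
).fetchall()

# Q3 via aggregation: tickers whose closing price equals the day's maximum.
top = con.execute("""
    SELECT Ticker FROM prices
    WHERE Date = '2007/08/15' AND Type = 'closing'
      AND Value = (SELECT MAX(Value) FROM prices
                   WHERE Date = '2007/08/15' AND Type = 'closing')
""").fetchall()
```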

17 SQL updates We can explicitly insert a tuple of values into a table. Can modify select fields of a specific tuple. Can perform query-driven updates.
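
The three kinds of update listed here correspond to INSERT with explicit values, UPDATE of selected fields, and statements whose target rows are chosen by a subquery. The Python `sqlite3` sketch below shows one of each on hypothetical data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE stocks(Ticker TEXT PRIMARY KEY, Company TEXT);
CREATE TABLE prices(Date TEXT, Ticker TEXT, Type TEXT, Value REAL);
-- Hypothetical sample rows; 'OLD' has prices but no stocks entry.
INSERT INTO stocks VALUES ('SYN', 'Syncrude Corp.');
INSERT INTO prices VALUES
  ('2007/08/15', 'SYN', 'closing', 40.0),
  ('2007/08/15', 'OLD', 'closing', 5.0);
""")

# 1. Insert an explicit tuple of values.
con.execute("INSERT INTO stocks VALUES (?, ?)", ("XYZ", "XYZ Ltd."))

# 2. Modify a selected field of a specific tuple.
con.execute("UPDATE stocks SET Company = ? WHERE Ticker = ?",
            ("XYZ Limited", "XYZ"))

# 3. Query-driven update: a subquery decides which rows are affected.
con.execute(
    "DELETE FROM prices WHERE Ticker NOT IN (SELECT Ticker FROM stocks)"
)
con.commit()

companies = con.execute(
    "SELECT Ticker, Company FROM stocks ORDER BY Ticker"
).fetchall()
remaining = con.execute("SELECT Ticker FROM prices").fetchall()
```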

18 SQL DDL Can define schema. Can define ICs and triggers.
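
As a sketch of what schema definition, integrity constraints, and triggers look like in practice, the Python `sqlite3` example below declares keys, a foreign key, CHECK constraints, and an audit trigger. The specific constraints, the `price_audit` table, and the trigger name are hypothetical choices for illustration; the slides do not prescribe them.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when enabled
con.executescript("""
CREATE TABLE stocks(
  Ticker  TEXT PRIMARY KEY,
  Company TEXT NOT NULL
);
CREATE TABLE prices(
  Date   TEXT NOT NULL,
  Ticker TEXT NOT NULL REFERENCES stocks(Ticker),
  Type   TEXT NOT NULL CHECK (Type IN ('opening', 'closing')),
  Value  REAL NOT NULL CHECK (Value >= 0),
  PRIMARY KEY (Date, Ticker, Type)
);
CREATE TABLE price_audit(Ticker TEXT, Value REAL);
CREATE TRIGGER log_price AFTER INSERT ON prices
BEGIN
  INSERT INTO price_audit VALUES (NEW.Ticker, NEW.Value);
END;
""")

con.execute("INSERT INTO stocks VALUES ('SYN', 'Syncrude Corp.')")
con.execute("INSERT INTO prices VALUES ('2007/08/15', 'SYN', 'closing', 40.0)")

# Rows that violate an IC are rejected by the DBMS.
rejected = []
for row in [("2007/08/15", "NOPE", "closing", 1.0),    # unknown ticker (FK)
            ("2007/08/16", "SYN", "closing", -5.0)]:   # negative value (CHECK)
    try:
        con.execute("INSERT INTO prices VALUES (?, ?, ?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(row[1])

audit = con.execute("SELECT * FROM price_audit").fetchall()
```

Only the one successful insert fires the trigger, so the audit table ends up with a single row.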

19 Intro. to Conjunctive Queries In datalog, a rule of the form: H ← B1,..., Bm. -range-restricted and safe. e.g., p(X,Y) ← a(X,Z), b(Z,W), c(Z,Y), W>1. In SQL, single block queries w/ no agg or grouping. In RA, SPJ queries. Tableau Queries.

20 Concurrency control Supports access by multiple users/processes, while preserving integrity of data. E.g.: child checking account balance; father depositing money into the account; mother making a withdrawal. Each transaction = read; change; write. Should be interleaved carefully to prevent incorrect state!
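
The danger of careless interleaving can be shown with a deterministic simulation. In the Python sketch below each transaction is a generator that pauses after its read, change, and write steps, and a schedule is just the order in which the steps run. The amounts and the `run` helper are hypothetical.

```python
def transaction(db, delta):
    """One read; change; write transaction, pausing after each step."""
    local = db["balance"]      # read
    yield "read"
    local += delta             # change (on a private copy)
    yield "change"
    db["balance"] = local      # write
    yield "write"


def run(schedule, deltas, start=100):
    """Execute the steps in the order given by schedule; return final balance."""
    db = {"balance": start}
    txns = {name: transaction(db, d) for name, d in deltas.items()}
    for name in schedule:
        next(txns[name])
    return db["balance"]


deltas = {"father": +50, "mother": -30}   # deposit 50, withdraw 30

serial = run(["father"] * 3 + ["mother"] * 3, deltas)
interleaved = run(["father", "mother"] * 3, deltas)
```

Run one after the other, the balance ends at 120. In the interleaved schedule both transactions read 100 before either writes, so the deposit is lost and the balance ends at 70.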

21 Transactions Atomicity: a transaction either succeeds as a whole or fails as a whole; nothing part way. Consistency: only transactions that respect DB’s ICs are allowed. Isolation: at any time, the schedule of actions (coming from diff. transactions) being performed is serializable, i.e., is equivalent to running them one transaction at a time. Durability: after a commit, the effect of a transaction persists.
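
Atomicity and consistency can be observed together in a small experiment. In the Python `sqlite3` sketch below, a transfer updates two rows inside one transaction; if the second update violates a CHECK constraint, the first is rolled back as well. The `accounts` table, the amounts, and the `transfer` helper are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE accounts(name TEXT PRIMARY KEY, "
    "balance REAL CHECK (balance >= 0))"       # IC: no overdrafts
)
con.execute("INSERT INTO accounts VALUES ('father', 100), ('mother', 20)")
con.commit()


def transfer(con, src, dst, amount):
    """Move amount from src to dst; all or nothing."""
    try:
        with con:   # commits on success, rolls back if an exception escapes
            con.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            con.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(con, "father", "mother", 30)       # fine
failed = transfer(con, "mother", "father", 500)  # would overdraw mother
balances = dict(con.execute("SELECT name, balance FROM accounts"))
```

In the failed transfer the credit to `father` has already been applied when the debit is rejected, yet the final balances show no trace of it.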

22 Recovery From disk failures – done through RAID. From power failures – done by keeping a detailed log of transactions (actions) performed. Roll back if need be to preserve correct state.
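
One simple way to picture log-based rollback is undo logging: before a value is overwritten, its old value is logged, and after a crash every write by a transaction without a commit record is undone in reverse order. The Python sketch below assumes that scheme and a hypothetical log format; real systems use considerably more elaborate protocols.

```python
def recover(db, log):
    """Undo the writes of uncommitted transactions, newest first."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _kind, _txn, key, old_value = rec
            db[key] = old_value     # restore the value from before the write
    return db


# Hypothetical log records: ("write", txn, key, old_value) or ("commit", txn).
log = [
    ("write", "T1", "A", 100),   # T1 changed A from 100 (to 120)
    ("commit", "T1"),
    ("write", "T2", "B", 50),    # T2 changed B from 50 (to 10), never committed
]
db = {"A": 120, "B": 10}         # state found on disk after the power failure

recovered = recover(db, log)
```

T1 committed, so its effect on `A` is kept; T2 did not, so `B` is rolled back to 50.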

23 DBMS Architecture

24 Summing it all up DBMS – one of the most sophisticated mission-critical software systems. Real DBMSs – tend to be complex with many components. Query Optimizer, Transaction Manager, Disk Space Manager – key components. Based on decades of solid research. In some ways, RDBMS as a model and as a technology – a gold standard: –For data models. –For software systems.

25 Further Reading In addition to the list already seen: P. Bernstein, V. Hadzilacos, and N. Goodman: Concurrency Control and Recovery in Database Systems. J. Gray and A. Reuter: Transaction Processing: Concepts and Techniques. M. Stonebraker and J. Hellerstein: Readings in DB Systems (the red book) – contains several great papers (on CC & Recovery and other topics).

