Presentation is loading. Please wait.

Presentation is loading. Please wait.

EECS 647: Introduction to Database Systems

Similar presentations


Presentation on theme: "EECS 647: Introduction to Database Systems"— Presentation transcript:

1 EECS 647: Introduction to Database Systems
Instructor: Luke Huan Spring 2007

2 Luke Huan Univ. of Kansas
Administrative The following book has been reserved in the engineering library: Data mining : concepts and techniques by Jiawei Han and Micheline Kamber 11/9/2018 Luke Huan Univ. of Kansas

3 Review: Data Integrity in SQL
Constraints Trigger 11/9/2018 Luke Huan Univ. of Kansas

4 Luke Huan Univ. of Kansas
Trigger example CREATE TRIGGER EECS168AutoRecruit AFTER INSERT ON Student REFERENCING NEW ROW AS newStudent FOR EACH ROW WHEN (newStudent.GPA > 3.0) INSERT INTO Enroll VALUES(newStudent.SID, ’EECS168’); Event Condition Action 11/9/2018 Luke Huan Univ. of Kansas

5 Luke Huan Univ. of Kansas
Today’s Topic View Index Transaction Recursion 11/9/2018 Luke Huan Univ. of Kansas

6 Luke Huan Univ. of Kansas
Views A view is like a “virtual” table Defined by a query, which describes how to compute the view contents on the fly DBMS stores the view definition query instead of view contents Can be used in queries just like a regular table 11/9/2018 Luke Huan Univ. of Kansas

7 Creating and dropping views
Example: EECS647roster CREATE VIEW EECS647Roster AS SELECT SID, name, age, GPA FROM Student WHERE SID IN (SELECT SID FROM Enroll WHERE CID = ’EECS647’); To drop a view DROP VIEW view_name; Called “base tables” 11/9/2018 Luke Huan Univ. of Kansas

8 Luke Huan Univ. of Kansas
Using views in queries Example: find the average GPA of EECS647 students SELECT AVG(GPA) FROM EECS647Roster; To process the query, replace the reference to the view by its definition SELECT AVG(GPA) FROM (SELECT SID, name, age, GPA FROM Student WHERE SID IN (SELECT SID FROM Enroll WHERE CID = ’EECS647’)); 11/9/2018 Luke Huan Univ. of Kansas

9 Luke Huan Univ. of Kansas
Why use views? To hide data from users To hide complexity from users Logical data independence If applications deal with views, we can change the underlying schema without affecting applications Recall physical data independence: change the physical organization of data without affecting applications To provide a uniform interface for different implementations or sources Real database applications use tons of views 11/9/2018 Luke Huan Univ. of Kansas

10 Luke Huan Univ. of Kansas
Modifying views Does not seem to make sense since views are virtual But does make sense if that is how users see the database Goal: modify the base tables such that the modification would appear to have been accomplished on the view Be careful! There may be one way to modify There may be many ways to modify There may be no way to modify 11/9/2018 Luke Huan Univ. of Kansas

11 Luke Huan Univ. of Kansas
A simple case CREATE VIEW StudentGPA AS SELECT SID, GPA FROM Student; DELETE FROM StudentGPA WHERE SID = 123; translates to: DELETE FROM Student WHERE SID = 123; 11/9/2018 Luke Huan Univ. of Kansas

12 Luke Huan Univ. of Kansas
An impossible case CREATE VIEW HighGPAStudent AS SELECT SID, GPA FROM Student WHERE GPA > 3.7; INSERT INTO HighGPAStudent VALUES(987, 2.5); No matter what you do on Student, the inserted row will not be in HighGPAStudent 11/9/2018 Luke Huan Univ. of Kansas

13 A case with too many possibilities
CREATE VIEW AverageGPA(GPA) AS SELECT AVG(GPA) FROM Student; Note that you can rename columns in view definition UPDATE AverageGPA SET GPA = 2.5; Set everybody’s GPA to 2.5? Adjust everybody’s GPA by the same amount? Just lower Lisa’s GPA? 11/9/2018 Luke Huan Univ. of Kansas

14 Luke Huan Univ. of Kansas
SQL92 updateable views More or less just single-table selection queries No join No aggregation No subqueries Arguably somewhat restrictive Still might get it wrong in some cases See the slide titled “An impossible case” Adding WITH CHECK OPTION to the end of the view definition will make DBMS reject such modifications 11/9/2018 Luke Huan Univ. of Kansas

15 Luke Huan Univ. of Kansas
Indexes An index is an auxiliary persistent data structure Search tree (e.g., B+-tree), lookup table (e.g., hash table), etc. More on indexes in the second half of this course! An index on R.A can speed up accesses of the form R.A = value R.A > value (sometimes; depending on the index type) An index on ( R.A1, …, R.An ) can speed up R.A1 = value1 Æ … Æ R.An = valuen (R.A1 , … R.An) > (value1, …, valuen) (again depends) 11/9/2018 Luke Huan Univ. of Kansas

16 Examples of using indexes
SELECT * FROM Student WHERE name = ‘Kevin’ Without an index on Student.name: must scan the entire table if we store Student as a flat file of unordered rows With index: go “directly” to rows with name = ’Kevin’ Kevin Susan Bob Mary John name sid name age gpa 1234 John 21 3.5 1123 Mary 19 3.8 1011 Bob 22 2.6 1204 Susan 3.4 1306 Kevin 18 2.9 11/9/2018 Luke Huan Univ. of Kansas

17 Creating and dropping indexes
CREATE [UNIQUE] INDEX index_name ON table_name(column_name1, …, column_namen); With UNIQUE, the DBMS will also enforce that {column_name1, …, column_namen} is a key of table_name DROP INDEX index_name; Typically, the DBMS will automatically create indexes for PRIMARY KEY and UNIQUE constraint declarations 11/9/2018 Luke Huan Univ. of Kansas

18 Choosing indexes to create
More indexes = better performance? Indexes take space Indexes need to be maintained when data is updated Indexes have one more level of indirection Optimal index selection depends on both query and update workload and the size of tables 11/9/2018 Luke Huan Univ. of Kansas

19 Luke Huan Univ. of Kansas
Transactions A program may carry out many operations on the data retrieved from the database However, the DBMS is only concerned about what data is read/written from/to the database. database - a fixed set of named data objects (A, B, C, …) transaction - a sequence of read and write operations (read(A), write(B), …) DBMS’s abstract view of a user program 11/9/2018 Luke Huan Univ. of Kansas

20 Correctness criteria: The ACID properties
A tomicity: All actions in the Xact happen, or none happen. C onsistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent. I solation: Execution of one Xact is isolated from that of other Xacts. D urability: If a Xact commits, its effects persist.

21 An Example about SQL Transaction
Consider two transactions (Xacts): T1: BEGIN A=A+100, B=B END T2: BEGIN A=1.06*A, B=1.06*B END 1st xact transfers $100 from B’s account to A’s 2nd credits both accounts with 6% interest. Assume at first A and B each have $ What are the legal outcomes of running T1 and T2??? $1100 *1.06 = $1166 There is no guarantee that T1 will execute before T2 or vice-versa, if both are submitted together. But, the net effect must be equivalent to these two transactions running serially in some order. 11/9/2018 Luke Huan Univ. of Kansas

22 Luke Huan Univ. of Kansas
Example (Contd.) Legal outcomes: A=1166,B=954 or A=1160,B=960 Consider a possible interleaved schedule: T1: A=A+100, B=B-100 T2: A=1.06*A, B=1.06*B This is OK (same as T1;T2). But what about: T1: A=A+100, B=B-100 T2: A=1.06*A, B=1.06*B Result: A=1166, B=960; A+B = 2126, bank loses $6 The DBMS’s view of the second schedule: T1: R(A), W(A), R(B), W(B) T2: R(A), W(A), R(B), W(B) 11/9/2018 Luke Huan Univ. of Kansas

23 Luke Huan Univ. of Kansas
SQL transactions Syntax in pgSQL: BEGIN <database operations> COMMIT [ROLLBACK] A transaction is automatically started when a user executes an SQL statement (begin is optional) Subsequent statements in the same session are executed as part of this transaction Statements see changes made by earlier ones in the same transaction Statements in other concurrently running transactions do not see these changes COMMIT command commits the transaction (flushing the update to disk) ROLLBACK command aborts the transaction (all effects are undone) 11/9/2018 Luke Huan Univ. of Kansas

24 Luke Huan Univ. of Kansas
Atomicity Partial effects of a transaction must be undone when User explicitly aborts the transaction using ROLLBACK E.g., application asks for user confirmation in the last step and issues COMMIT or ROLLBACK depending on the response The DBMS crashes before a transaction commits Partial effects of a modification statement must be undone when any constraint is violated However, only this statement is rolled back; the transaction continues How is atomicity achieved? Logging (to support undo) 11/9/2018 Luke Huan Univ. of Kansas

25 Luke Huan Univ. of Kansas
Isolation Transactions must appear to be executed in a serial schedule (with no interleaving operations) For performance, DBMS executes transactions using a serializable schedule In this schedule, only those operations that can be interleaved are executed concurrently Those that can not be interleaved are in a serialized way The schedule guarantees to produce the same effects as a serial schedule 11/9/2018 Luke Huan Univ. of Kansas

26 Luke Huan Univ. of Kansas
SQL isolation levels Strongest isolation level: SERIALIZABLE Complete isolation SQL default Weaker isolation levels: REPEATABLE READ, READ COMMITTED, READ UNCOMMITTED Increase performance by eliminating overhead and allowing higher degrees of concurrency Trade-off: sometimes you get the “wrong” answer 11/9/2018 Luke Huan Univ. of Kansas

27 Luke Huan Univ. of Kansas
READ UNCOMMITTED Can read “dirty” data A data item is dirty if it is written by an uncommitted transaction Problem: What if the transaction that wrote the dirty data eventually aborts? Example: wrong average -- T1: T2: UPDATE Student SET GPA = 3.0 WHERE SID = 142; SELECT AVG(GPA) FROM Student; ROLLBACK; COMMIT; Doesn’t lock Or short-duration read/write lock: lock and release In practice, probably just acquire latches on pages to prevent reading physically corrupted data 11/9/2018 Luke Huan Univ. of Kansas

28 Luke Huan Univ. of Kansas
READ COMMITTED No dirty reads, but non-repeatable reads possible Reading the same data item twice can produce different results Example: different averages -- T1: T2: SELECT AVG(GPA) FROM Student; UPDATE Student SET GPA = 3.0 WHERE SID = 142; COMMIT; SELECT AVG(GPA) FROM Student; COMMIT; Read Lock the stuff before access, release lock right afterwards Write lock are long-duration: hold until the end (to prevent read uncommitted) 11/9/2018 Luke Huan Univ. of Kansas

29 Luke Huan Univ. of Kansas
REPEATABLE READ Reads are repeatable, but may see phantoms Example: different average (still!) -- T1: T2: SELECT AVG(GPA) FROM Student; INSERT INTO Student VALUES(789, ‘Nelson’, 10, 1.0); COMMIT; SELECT AVG(GPA) FROM Student; COMMIT; Lock the stuff I access, and don’t release until I finish To fix: you can lock entire table, but that’s bad; => lock things that don’t exist yet to prevent insert 11/9/2018 Luke Huan Univ. of Kansas

30 Summary of SQL isolation levels
Isolation level/anomaly Dirty reads Non-repeatable reads Phantoms READ UNCOMMITTED Possible READ COMMITTED Impossible REPEATABLE READ SERIALIZABLE Syntax: At the beginning of a transaction, SET TRANSACTION ISOLATION LEVEL isolation_level [READ ONLY|READ WRITE]; READ UNCOMMITTED can only be READ ONLY 11/9/2018 Luke Huan Univ. of Kansas

31 Luke Huan Univ. of Kansas
Recursion in SQL SQL2 had no recursion You can find Bart’s parents, grandparents, great grandparents, etc. SELECT p1.parent AS grandparent FROM Parent p1, Parent p2 WHERE p1.child = p2.parent AND p2.child = ’Bart’; But you cannot find all his ancestors with a single query SQL3 introduces recursion WITH clause Implemented in DB2 (called common table expressions) 11/9/2018 Luke Huan Univ. of Kansas

32 Luke Huan Univ. of Kansas
Ancestor query in SQL3 WITH Ancestor(anc, desc) AS ((SELECT parent, child FROM Parent) UNION (SELECT a1.anc, a2.desc FROM Ancestor a1, Ancestor a2 WHERE a1.desc = a2.anc)) SELECT anc FROM Ancestor WHERE desc = ’Bart’; base case Define a a relation recursively recursion step Query using the relation defined in WITH clause How do we compute such a recursive query? 11/9/2018 Luke Huan Univ. of Kansas

33 Summary of SQL features covered so far
Query Modification Constraints Triggers Views Transaction Recursion Next: Database programming 11/9/2018 Luke Huan Univ. of Kansas


Download ppt "EECS 647: Introduction to Database Systems"

Similar presentations


Ads by Google